1.  INTRODUCTION


Innovations in sensing component technology are spurring new research in the area of target
signature measurement and exploitation. Innovations include multi-channel spatially diverse
antennas, sensitive receivers, fast analog-to-digital converters, adaptive transmit waveforms,
and sparse sampling approaches. These innovations support new signature information sensing
functions including calibrated target measurements, feature processing, and inference-based
decision algorithms. The ability to characterize information extraction under the effects of
system uncertainties is critical to risk-based design methods. Existing systems theory
prototypes such as the radar range equation are inadequate to fully characterize the flow of
information through the sensing system.


The success of any theory model in the above context will largely depend on its ability to
address several challenges: (1) characterize information gain and performance within
various stages of the system, (2) propagate the effects of the uncertainty sources acting on
individual components within the system to the predicted system performance measures, (3)
effectively minimize the overall loss in the information flow while trading costs associated with
component design, and (4) operate effectively within the nonlinear high dimensional spaces
inherent in signature sensor systems.


1.1  Case  for  Information  Theoretic 

The use of information theoretic principles affords several advantages in dealing with the
challenges in the information sensing and exploitation areas. First, information theory
prototypes enable the study of the propagating effects of uncertainty on system performance at
the various points of noise infiltration. Using Fano’s inequality, the max flow [1] criterion bounds
the optimal Bayes error. Entropy and mutual information (MI) are analytically connected to the
probability of error (Pe) and, more generally, to the Neyman-Pearson criteria, allowing the rate of
noise infiltration to be related to the rate of entropy growth and ultimately to the rate of
degradation of system performance. The information loss associated with uncertainty sources can
then be characterized in terms of a confidence interval about the predicted system performance at
each component of the system. The data processing inequality affords a method to determine
information loss points and maximize information flow via component trades within a system
information loss budget.


Second, the convexity of mutual information yields a unique solution and enables rapid
numerical  convergence  (low  computational  complexity)  to  maximum  MI  configurations  [2],  [3]. 
MI  affords  the  optimization  of  a  scalar  quantity  while  classical  Bayes  likelihood  ratio  techniques 
can  involve  optimizing  on  non-convex  surfaces  over  high  dimensional  signature  spaces.  On  a 
convex  surface,  the  use  of  highly  efficient  search  algorithms  such  as  the  Conjugate  Gradient 
method  will  converge  on  the  order  of  n  operations  (n  dimensional  problem).  While  entropy 
based  methods  operate  non-parametrically  such  that  the  probability  does  not  have  to  be 
estimated,  complicating  factors  can  include  numerical  computation  issues  that  occur  within  high 
dimensional  processes  (Bellman’s  Curse  of  Dimensionality).  It  can  be  shown  however  [4]  that 
computing  the  entropy  of  the  multivariate  sensor  signature  processes  is  also  O(n).  As  a 
consequence  of  the  law  of  large  numbers,  the  asymptotic  equipartition  property  asserts  that  there 
are  large  regions  within  the  entropic  signature  subspace  which  will  never  occur  under  the 
decision  hypotheses  [2].  Thus,  the  information  theoretic  approach  holds  the  potential  to  exploit 
entropy  based  methods  operating  within  this  “typical”  signature  subspace. 


Third, classical statistical pattern recognition approaches use the maximum likelihood (ML)
decision criteria which include only the 2nd order statistics present in the training process. The
use of MI in nonlinear processing affords advantages over linear processing in that it accounts
for higher-order statistics within the design of nonlinear optimal decision rules and in the
optimization of features. Nonlinear scattering phenomena resulting from the interaction of
individual target mechanisms can also reduce the effectiveness of second order techniques [5] in
the optimization of diverse transmit waveforms. The use of MI as a nonlinear signal processing
method for optimizing waveform design will address these phenomena. It is these inherent
benefits that distinguish the information theoretic approach from traditional statistical pattern
recognition methods.


1.2  Historical  Contributions 

The use of information theory in the area of radar dates back to the early 1950s. Woodward
and Davies [6] and Woodward [7] were the first to apply the information theoretic approach to
the analysis of radar soon after the appearance of Shannon’s original work [8] on information
theory. More recently Bell [9] has suggested the use of an information theoretic approach to the
design of radar waveforms. Bell formulated and obtained a solution to the problem of
designing a waveform that maximized the MI between the target impulse response (viewed as a
random process) and the received signal. Recently, Leshem et al. [10] extended Bell’s work to
the case of multiple extended targets. Sowelam and Tewfik [11] also used waveform design in
conjunction with the Kullback-Leibler [12] criterion to distinguish between different target
classes. Briles [13] applied rate distortion theory to analyze impulse radar for use in target
identification design and performance prediction. Horne and Malvern [14] introduce a high
level theoretical framework to calculate the information conveyed by the image of a target based
on pixel values relative to the modeled fluctuations of these values. Principe, Xu, Zhao, and
Fisher [15] present a framework for learning based on information theoretic criteria and have
studied the application of the Fano bound to the ATR problem set. Methods such as the
maximum likelihood test have been used to evaluate radar signature processes for target
classification performance as in the work by O’Sullivan et al. [16]. This framework proposes
several approximations to the Kullback-Leibler divergence that can be used to estimate statistical
distances compatible with pattern matching algorithms. Pasala and Malas [17] introduce the use
of MI as a similarity measure for use in the evaluation of suitability of radar signature training
surrogates. Recently, there has been much interest in radars with a new architecture referred to
as the MIMO (Multiple Input Multiple Output) radar [18] - [21]. It is the information theoretic
approach that unifies the analysis of these radar systems.

Several contributions within the body of existing referenced work including [9] and [13] have
(in one form or another) presented the radar system in terms of a Markov chain within a channel
configuration and characterized the information flow from source to sink. Tishby [22] has
developed the information bottleneck approach, wherein rate-distortion theory, the Data
Processing Theorem, and compression play major roles. The max-flow min-cut application to
the channel problem has been studied to understand the relationship of capacity to information
flow [23]. Ahlswede and Yeung [1] have extended this analysis to network information flow
where a single information source is multicast to multiple destinations. B. C. Geiger et al. [24]
have studied the information loss induced by static nonlinearities within a memoryless nonlinear
input-output system and conclude that a particular output can result from multiple inputs.

Merhav [25] has published a series of works on information measures with application to
estimation theory. While this work is thematically similar to the work herein, the author has
used a different theoretical approach to the study of information theory and decision theory.


The  area  of  uncertainty  modeling  and  sensitivity  analysis  is  wide  ranging  drawing  upon  the 
established  fields  such  as  the  design  of  experiments  [26]  and  classical  engineering  methods  of 
statistics  that  lead  to  uncertainty  measures  [27].  The  subject  of  propagation  of  uncertainty  has 
been  firmly  established  within  traditional  methods  of  Taylor  Series  expansion  and  differential 
calculus [28]. Recently, advances in large scale computer simulation have opened the door to
modeling  complex  physical  processes  in  lieu  of  expensive  physical  experiments  [29].  The 
methods  by  Saltelli  et  al.  [30]  present  new  Monte-Carlo  Methods  for  the  study  of  sensitivity. 


A significant contribution of the research reported to the existing body of work is the
development and demonstration of a systems theory model for the study of the effects of
uncertainty on the information flow within the various components of the sensor system. Fano’s
equality is developed in a mathematically concise form and is joined with the
data-processing inequality to address nonlinear paradigms in sensing. While not new in the
literature, the unified use of the Fano equality with the Data Processing inequality in a
Markovian channel construct is a centerpiece of this work. An abbreviated and preliminary
treatment of this concept is presented in [31]. Entropy is used to model propagating uncertainty
within an information channel. Measures are developed to identify information flow bottlenecks.
The mathematical framework for an information theoretic approach to estimation and hypothesis
testing is applied to a multidisciplinary problem set. Techniques for bounding asymptotic
performance under sufficient statistics are characterized and related to phase transitions within
the typical set trajectory associated with sampling uncertainty. Nonparametric performance
estimations are developed at various points in the information sensing pipeline. Minimum
sampling requirements for performance prediction are developed on high dimensional signature
processes based on low sample entropic estimates.

2.  Approach


2.1  Systems  Theory  Model 

Taking an information theoretic view, degrading effects are considered as sources of entropy.
Treating the system as an information flow pipeline from input to output, the injective entropy
acts to degrade the Shannon MI between the input and output. The systems model is
demonstrated as a suitable vehicle for performing component level design trades within the
information sensing application based on a component level information loss budget (bits).
Demonstration of the max flow in conjunction with the Data Processing Inequality provides
analysis of “bottlenecks” in the information flow pipeline. Key attributes of the theoretical
model have been demonstrated under the constraints of a radar high range resolution sensor
system example. Modeling and simulation for simplified target scattering models are used to
illustrate the value of component level analysis under the propagating effects of various sources
of uncertainty.


2.2  Addressing  Dimensionality 

Interdependencies among multivariate target signatures can significantly impede information
extraction. The number of samples required to estimate the underlying signature statistics is
related to the incremental increases in uncertainty. Baseline statistical sample requirements (in
the native coordinate system) associated with the resolved radio frequency target scattering are
characterized for specified states of certainty. Methods are developed that estimate the sampling
requirements for entropic quantities based on a characterization of the typical set underlying the
sufficient statistics of the random signature process. The variance of the performance estimate
associated with low sample count Monte-Carlo experiments can be scaled (via the central limit
theorem) to estimate the performance variance associated with higher sample counts.
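
The variance scaling described above can be sketched as follows. This is a minimal illustration, assuming the performance estimate is a sample mean over independent trials so that its variance falls as 1/N; the Bernoulli error model, the operating point, and the function names `estimate_pe` and `scaled_variance` are hypothetical and are not taken from the report.

```python
import random
import statistics

def estimate_pe(n_samples, true_pe, rng):
    """Monte Carlo estimate of Pe from n_samples independent Bernoulli error trials."""
    errors = sum(1 for _ in range(n_samples) if rng.random() < true_pe)
    return errors / n_samples

def scaled_variance(var_low, n_low, n_high):
    """Scale the variance of a sample-mean estimate from n_low to n_high samples (1/N law)."""
    return var_low * n_low / n_high

rng = random.Random(0)
true_pe = 0.1            # hypothetical operating point (assumption)
n_low, n_high = 50, 400  # low and high sample counts (assumption)
trials = 1000            # repeated experiments used to measure estimator variance

# Variance observed at the low sample count, then extrapolated to the high count.
var_low = statistics.pvariance([estimate_pe(n_low, true_pe, rng) for _ in range(trials)])
var_high_pred = scaled_variance(var_low, n_low, n_high)

# Direct measurement at the high sample count, for comparison only.
var_high_meas = statistics.pvariance([estimate_pe(n_high, true_pe, rng) for _ in range(trials)])
print(var_low, var_high_pred, var_high_meas)
```

The comparison at the end is only a sanity check on the 1/N assumption; in practice the point of the scaling is to avoid running the high sample count experiment at all.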

2.3  Uncertainty  Analysis


Both sensor uncertainty and model training uncertainty are propagated into the classifier where
uncertain decisions are inferred from uncertain observations. The uncertainty (increase in
entropy) is ultimately realized in the form of confidence or reliability intervals about the
estimated system performance. Mathematically defined categories of uncertainty are developed
to better understand the entropic effects within the information sensing system. A sensitivity
analysis is performed to study the relative significance of various “unknown” operating
conditions to the reliability of the performance estimate at each component of the system. The
effects of sampling uncertainty are contrasted to reliability of performance estimates. This
comparison forms the basis for the study of the variance effects in performance estimation within
high dimensional signature processes subject to unknown operating conditions.

3.  THEORY


3.1  Uncertainty  Sources  within  an  Information  Sensing  System  and  the  Decision 
Rule  Subspace 

It  is  important  to  contrast  the  proposed  concept  of  uncertainty  with  several  terms  generally 
used  by  the  measurement  community  [32].  Accuracy  refers  to  the  agreement  between  a 
measurement  and  the  true  or  correct  value.  Precision  refers  to  the  repeatability  of  a 
measurement.  Error  refers  to  the  disagreement  between  a  measurement  and  the  true  or  accepted 
value.  The  uncertainty  in  a  stated  measurement  is  the  interval  of  confidence  around  the 
measured  value  such  that  the  measured  value  is  expected  not  to  lie  outside  this  stated  interval. 
The  use  of  the  term  “uncertainty”  implies  that  the  true  value  may  not  be  known  and  can  be  stated 
along with a probability. These definitions recognize the deterministic nature of error and the
stochastic nature of uncertainty. However, uncertainty as currently defined by the sensor
measurement community may not be sufficient to address the full range of issues under study
within an information sensing system.


Modern sensing systems produce signature measurements that when combined with the effects
of  various  system  uncertainties  are  realized  as  a  random  signature  process.  Conclusions  are 
inferred  by  applying  instances  taken  from  this  random  measured  signature  process  to  a  decision 
rule.  The  “unknowable”  nature  of  parameters  affecting  the  measured  signature  process  leads  to 
challenges  in  developing  a  signature  process  model  that  will  generate  an  optimal  decision  rule 
for  inferring  information.  The  combined  effects  of  these  sensing  and  training  uncertainties  limit 
the  exploitation  of  physics-based  features  and  result  in  a  loss  in  information  that  can  be  extracted 
from  target  signature  measurements.  The  resulting  decision  uncertainty  is  then  driven  by  both 
the  distorted  measurements  and  the  degree  of  agreement  between  the  signature  process  under 
measurement  and  the  process  used  to  train  the  optimal  decision  rule. 

3.2  Sensing  Uncertainty  Example


It  is  helpful  to  illustrate  the  concepts  surrounding  uncertainty  through  the  use  of  a  real  world 
example.  A  related  problem  of  interest  is  the  measurement  of  airborne  moving  objects  using 
high range resolution (HRR) waveforms. The successful extraction of information from these
measurements via a specific system design is complicated by several sources of uncertainty. The
two general classes of system uncertainty introduced above are given in Table I as “Sensing”
uncertainty and uncertainty resulting from “Decision Rule Training Limitations”. Sensing
uncertainty is divided into three subcategories: (a) signature measurement uncertainty due to
sensor design/limitations, (b) the uncertainty due to interference, and (c) object tracking position
and motion uncertainty.


The object under measurement by the sensing radar system can be viewed as a collection of
scattered field sources filling an electrically large volume in space. The system measurement of
this object is subject to uncertainty identified in source 1.a generating the statistical support
underlying a random signature process at a fixed position in time. Target fixed body motion
within the measurement interval induces scintillation within the scattering sources resulting in an
additional increase in entropy. Imperfect knowledge of target position, velocity, and aspect also
alters the statistical characterization of the random signature process (source 1.c). The random
signature process also interacts with an external environment (source 1.b) to further impact the
statistical nature of the measured signature process.

Table I

Radar System Uncertainty Sources

Uncertainty Core Area      Subcategory                  Parameter Uncertainty
-------------------------  ---------------------------  ------------------------------------------
1. Sensing                 a. Signature Measurement*    Non-linear Effects
                                                        I&Q Quantization/Clipping
                                                        Amplitude & Phase Calibration
                           b. Environmental             Clutter/Thermal Noise
                                                        RF Interference
                                                        Jamming
                           c. Object Tracking & Motion  Object Range, Velocity, & Aspect Estimates
                                                        Object Articulation
                                                        Intra-measurement Motion
2. Decision Rule Training                               Process Under Sampling*
   Limitations                                          Target Configuration Variation
                                                        Target Modeling Parameters

* Epistemic Uncertainty


The exploitation of this signature process using a decision algorithm requires the training¹ of
an optimal decision rule that operates within the entropy produced by sources 1.a, 1.b, and 1.c.
Only a subset of the phenomena (parameters) underlying source 1 can be modeled and/or
characterized within the statistical decision rule training process. Limitations within the training
process result in a decision rule design that is less than optimal with respect to system
performance.


Sources of uncertainty that arise because of natural, unpredictable variation within the system
under study are aleatoric and are considered “unknowable”. Source 1 uncertainties are aleatoric in
nature and as such can only be characterized statistically. Uncertainties that are conceptually
resolvable yet subject to systematic limitations are epistemic and are to be reduced as much as
possible within the analysis. The performance resulting solely from signature measurement
uncertainty source 1.a (highest system certainty state) is defined here as epistemic in nature and
in general can be modeled and characterized. Sources 1.b, 1.c, and 2 are in general aleatoric and
can result in a reduction in certainty from the highest certainty state.


1  Supervised  learning  assumed  as  the  classification  training  approach. 

3.3  Uncertainty  and  the  Decision  Rule  Subspace


The sources of uncertainty associated with source 2 in Table I are traceable to the
corresponding effects within the decision rule subspace in the classical statistical pattern
recognition approach to binary hypothesis testing. The decision rule design (decision
threshold d) is based on the statistical training support resulting from the uncertainties in Table I.
If the sensing uncertainties within source 1 are adequately represented in the statistics of the
training process, the decision rule design should provide optimal performance. The effects due
to many of the uncertainties in Table I are unavoidable. For example, target signature
realizations are often formed through the integration of many sequential measurements.
Intra-measurement object motion can cause distortion and induce uncertainty in the decision rule
subspace that is not accounted for in the decision rule training process. In another example, the
object under measurement may be configured differently than that represented in the training
data (for example, extra fuel tanks, wing flaps up, or a damaged surface).

3.4 Information  Theoretic  Decision  Rule  Subspace 

An alternative to the classic statistical pattern recognition approach to viewing the decision
rule subspace is shown in Fig. 1. In Fig. 1, the decision rule subspace is cast in terms of
information theoretic quantities based on entropy, a measure of the size of a typical set [2]. In
Fig. 1, information is defined in terms of the MI between the “typical subspaces” [2] associated
with the true object state H and the decision state Q where H and Q are discrete random
variables. System (and associated sub-component) designs that increase the MI between these
“typical signature subspaces” increase the flow of information. “Uncertainty” acts to alter the
typical signature subspaces (growth or movement) associated with the highest certainty state. A
change to the typical subspaces can result in a loss in the flow of information and a decrease in
decision performance.


Fig.  1.  Decision  Rule  Subspace  and  the  Overlapping  Typical  Sets 


*Modified  version  of  Figure  2.2,  [2] 


3.5 Information  Theoretic  Radar  Channel  Model 

The concept of uncertainty introduced in Fig. 1 can be realized in terms of an increase in
entropy within a discrete memoryless information channel. The radar information sensing
system can be viewed within this channel model depicting the information flow through the
signature sensing and processing components of a radar system as shown in Fig. 2 [2], [33].


[Block diagram not reproduced. Recoverable labels: Sensing, Waveform, Data Projection, Decision; injected sources: Thermal Noise, Sensing Uncertainty, Training Uncertainty.]

Fig. 2. Information Theoretic Sensor Channel Model


The  relationship  between  H  and  Q  is  the  basis  chosen  for  performance  characterization.  The 
discrete  random  variable  H  represents  which  of  the  Nc  possible  hypotheses  has  occurred.  Q  is  of 
the  same  alphabet  as  H.  Successful  flow  of  information  results  in  agreement  between  H  and  Q. 

Conditioned on the generating hypothesis h (an instance of H), there is typically a
multidimensional encoded source² X_E which is realized as the image projection of the scattered
field of the object under measurement. In the case of an HRR radar measurement, this is the
band-limited frequency response associated with the scattered field of the observed object in
thermal noise. After mixing, filtering, and signal processing, these returns become the measured
random signature vector X_n. The sensing of X_n is subjected to the uncertainties listed in source
1 of Table I leading to the random signature process X. The various cases of the sensed signature
are summarized in Table II.


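Table II

Sensor Measurement Signal Cases

Signal Case                                     Notation
----------------------------------------------  ---------------
Encoded Deterministic Multivariate Signal       X_E
Deterministic Signal in Additive Noise          X_n = X_E + n
Random Multivariate Signal in Additive Noise    X = X_E + n (X_E subject to the source 1 uncertainties of Table I)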

The multivariate sample feature Y^i is extracted from the ith instance test sample X^i to support
the desired function of the exploitation system. Given the random nature of X, the extracted
signature feature Y is also random. The training feature process Y' is developed from the set of
typical signatures within the decision rule training process X'. X' (and thus Y') is developed
offline using a surrogate process and is used to determine the ‘optimal’ decision rule d. The
decision algorithm applies Y^i to the decision rule d yielding the decision q (an instance of Q)
declaring which of the hypotheses has occurred. The evaluation of an ensemble of Nm test
samples X^i {i = 1 → Nm} produces the sample ensemble of the Nm matching tests (h, q) to
statistically characterize the decision performance.
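
As an illustration of how an ensemble of matching tests can characterize decision performance, the sketch below forms the empirical joint distribution of true hypothesis and decision and computes the error rate and a plug-in estimate of I(H;Q). The sample data, the ensemble size, and the function names are hypothetical, and plug-in MI estimates of this kind are known to be biased when the ensemble is small.

```python
import math
from collections import Counter

def empirical_joint(pairs):
    """Relative frequency of each (hypothesis, decision) outcome in the ensemble."""
    counts = Counter(pairs)
    n = len(pairs)
    return {hq: c / n for hq, c in counts.items()}

def error_probability(joint):
    """Pe estimate: total probability mass where decision and hypothesis disagree."""
    return sum(p for (h, q), p in joint.items() if h != q)

def mutual_information(joint):
    """Plug-in estimate of I(H;Q) in bits from a joint distribution over (h, q)."""
    ph, pq = Counter(), Counter()
    for (h, q), p in joint.items():
        ph[h] += p
        pq[q] += p
    return sum(p * math.log2(p / (ph[h] * pq[q]))
               for (h, q), p in joint.items() if p > 0)

# Hypothetical ensemble of Nm = 10 matching tests (true hypothesis, decision).
pairs = [(0, 0)] * 4 + [(0, 1)] * 1 + [(1, 1)] * 4 + [(1, 0)] * 1
joint = empirical_joint(pairs)
pe = error_probability(joint)
mi = mutual_information(joint)
print(pe, mi)
```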


2  X_E in this context is deterministic resulting from the convolution of the target’s physical scattering mechanisms with the
transmitted waveform s. Given the “unknowable” nature of this code through measurement or modeling, the code itself is only
observable in the random form of X.

3.6 Fano  Based  Information  Theoretic  Method  (FBIT)


It is desired to quantify the effects of “uncertainty” and the associated alteration to the typical
signature subspaces in terms of the flow of information and the impact to system performance.
Two theorems from information theory play key roles in the development of these relationships.
Fano’s Inequality relates information theoretic quantities to the Pe criterion for an object
classification system [2]. The Data Processing Inequality [2] affords the analysis of the flow of
information from measured object returns through the signature sensing, signal processing
architecture, and into the decision stage, detailing where information is lost and quantifying the
impact on system performance. In this manner, stages in the information processing pipeline
where information is lost can be identified, analyzed, and optimized, leading to improvement in
overall system performance.


Fano’s inequality provides a mathematical means to relate the MI between H and Q, I(H;Q), to
a lower bound on Pe. Fano’s inequality [2] can be written as an equality as in (1).

$$H(P_e) = \delta - P_e \log(N_c - 1) + H(H \mid Q) \tag{1}$$

In (1), Pe is a real random variable between 0 and 0.5 representing the probability of error of the
decision algorithm. Nc is the discrete size of the alphabet of H and Q. H(H) is the Shannon
entropy of the discrete random variable H, and H(H|Q) is the conditional entropy of H given Q.
δ is a bias offset derived from asymmetries in the data and decision algorithm [32]. Typically δ
is small and to a first approximation may be neglected.


Theorem I: For Nc = 2, Fano’s equality can be written as H(Pe) = 1 - I(H;Q) + I(Q;V) where
V is the binary discrete random variable representing the probability that the decision rule
makes a correct decision.


Proof Sketch: Using I(H;Q) = H(H) - H(H|Q) and (1) we get (2) below.

$$H(P_e) = \delta - P_e \log(N_c - 1) + H(H) - I(H;Q) \tag{2}$$

The asymmetry factor in (2) can be computed directly from the output of the decision
algorithm. Let δ = I(Q;V) for Nc = 2, where V is the binary discrete random variable
indicating whether the decision rule makes a correct decision: V = 1 when H = Q; otherwise
V = 0. For Nc = 2 the term Pe log(Nc - 1) vanishes; taking H(H) = 1 bit (equiprobable
hypotheses), (2) can then be written more completely as in (3) below [32].

$$H(P_e) = 1 - I(H;Q) + I(Q;V) \tag{3}$$
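
The binary-case relation can be checked numerically. The sketch below evaluates both sides of (2) with δ = I(Q;V) and Nc = 2 for a hypothetical, deliberately asymmetric joint distribution of H and Q; when H(H) = 1 bit the right-hand side takes the form of (3). The joint probabilities and the helper names are illustrative assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(A;B) in bits, where joint[a][b] = P(A=a, B=b)."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for a, row in enumerate(joint)
               for b, p in enumerate(row) if p > 0)

# Hypothetical asymmetric joint distribution P(H=h, Q=q); rows are h, columns are q.
p_hq = [[0.35, 0.05],
        [0.20, 0.40]]

# Probability of error: mass where the decision disagrees with the hypothesis.
pe = p_hq[0][1] + p_hq[1][0]

# Joint distribution of (Q, V), with V = 1 when H == Q and V = 0 otherwise.
p_qv = [[p_hq[1][0], p_hq[0][0]],   # Q = 0: V = 0 means H = 1, V = 1 means H = 0
        [p_hq[0][1], p_hq[1][1]]]   # Q = 1: V = 0 means H = 0, V = 1 means H = 1

lhs = entropy([pe, 1 - pe])                              # H(Pe)
h_h = entropy([sum(row) for row in p_hq])                # H(H)
rhs = h_h - mutual_information(p_hq) + mutual_information(p_qv)
print(lhs, rhs)
```

The two sides agree because, for binary H, knowing Q and V determines H, so H(H|Q) = H(V|Q) = H(Pe) - I(Q;V).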


Equation (3) can be written in terms of the inverse entropy function F as shown in (4).

$$P_e = F\big(H(H) - I(H;Q) + I(Q;V)\big) \tag{4}$$

F is a deterministic strictly monotonically increasing function that maps information theoretic
quantities into the Pe at the corresponding operating point. The relationship of Pe to F(x) where
x ∈ [0, 1/2] is given in Fig. 3.


Fig.  3.  The  Binary  Entropy  Function,  Bits 


The quantity IL_Q in (5) is the end-to-end information loss for the system.

$$\mathrm{IL}_Q = H(H) - I(H;Q) \tag{5}$$

Minimizing the information loss minimizes the system Pe.

The entropic quantity H(H) is determined by the a priori probabilities of the outcomes of the
random variable H corresponding to the different target classes. δ is fixed by architectural
considerations. Since F is a known function, the deterministic relation Pe = F(H(H) - I(H;Q) +
I(Q;V)), for fixed H(H) and δ, determines the MI (I(H;Q)) needed to achieve a specified Pe.

For example, for an equiprobable binary hypothesis scenario³, H(H) = 1 bit and I(Q;V) ≈ 0, an
approximation for Pe can be written as

$$P_e \approx F\big(1 - I(H;Q)\big). \tag{6}$$

Specifying a desired Pe determines the amount of allowed ILQ. How the ILQ budget is “spent” as
information cascades from the input space at H to the classifier output space at Q can be traded
off via component (link) design. Fig. 4 presents an abstract diagram indicating possible
tradeoffs. Information losses within the channel can be studied with respect to various sources of
uncertainty in Table I.
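The budget idea can be sketched numerically: under the approximation in (6), a target Pe fixes the allowed loss as the binary entropy of that Pe, and the required mutual information is what remains of H(H). The helper names `allowed_loss` and `required_mi` are hypothetical, and the sketch assumes I(Q;V) ≈ 0.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def allowed_loss(target_pe: float) -> float:
    """Loss budget in bits implied by Pe ~ F(IL), i.e. IL = H(Pe)."""
    return binary_entropy(target_pe)


def required_mi(target_pe: float, prior_entropy: float = 1.0) -> float:
    """I(H;Q) needed to reach target_pe for a given H(H), assuming I(Q;V) ~ 0."""
    return prior_entropy - allowed_loss(target_pe)


if __name__ == "__main__":
    # Equiprobable binary case, H(H) = 1 Bit, target Pe of 5 percent.
    print(round(allowed_loss(0.05), 4), round(required_mi(0.05), 4))
```

For a target Pe of 0.05 this gives a budget of roughly 0.29 Bits, leaving about 0.71 Bits of mutual information that the channel must deliver.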


Fig.  4.  Example  of  Trading  System  Component  Level 
Design  for  Information  in  Bits  [33] 


³ The selection of the uniform prior on H is for illustration purposes and without loss of generality.

The Data Processing Inequality states that information can only be lost in the channel, as shown
in (7).


I(H;X) ≥ I(H;Y) ≥ I(H;Q)  (7)

Using the relationships in (5) and (4), the loss associated with each link within the channel can be
characterized as in (8).

H(H) - I(H;X) ≤ H(H) - I(H;Y) ≤ H(H) - I(H;Q).  (8)

The cumulative information loss at each link in the channel can then be written as below,
applying (5).

ILX = H(H) - I(H;X); x ∈ |X|  (9a)

ILY = H(H) - I(H;Y); y ∈ |Y|  (9b)

ILQ = H(H) - I(H;Q)  (9c)

The respective information loss due to each link within the Markov chain H → X → Y → Q can
then be defined as in equations (10a)-(10c).

Loss due to Sensing = ILSA = H(H) - I(H;X)  (10a)

Loss due to Feature Extraction = ILFA = I(H;X) - I(H;Y)  (10b)

Loss due to Decision Rule = ILDA = I(H;Y) - I(H;Q)  (10c)

Thus, the probability of error can be estimated at various points in the channel using the
approximation in (6):

PeX ≈ F(H(H) - I(H;X))  (11a)

PeY ≈ F(H(H) - I(H;Y))  (11b)

PeQ ≈ F(H(H) - I(H;Q))  (11c)
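The link bookkeeping in (10a)-(10c) and (11a)-(11c) can be sketched as follows. The mutual information values in the usage example are hypothetical, and `link_losses` and `link_pe` are illustrative names, not functions from the report.

```python
import math


def binary_entropy(p: float) -> float:
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def inverse_entropy(h: float, tol: float = 1e-12) -> float:
    """F(h): Pe in [0, 1/2] with binary entropy h, found by bisection."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("entropy must lie in [0, 1] bits")
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binary_entropy(mid) < h:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def link_losses(h_prior: float, mi_x: float, mi_y: float, mi_q: float) -> dict:
    """Per-link losses (10a)-(10c); inputs must respect the ordering in (7)."""
    if not h_prior >= mi_x >= mi_y >= mi_q >= 0.0:
        raise ValueError("expected H(H) >= I(H;X) >= I(H;Y) >= I(H;Q) >= 0")
    return {
        "sensing": h_prior - mi_x,
        "feature_extraction": mi_x - mi_y,
        "decision_rule": mi_y - mi_q,
    }


def link_pe(h_prior: float, mi: float) -> float:
    """Approximate Pe at a link, as in (11a)-(11c)."""
    return inverse_entropy(h_prior - mi)


if __name__ == "__main__":
    # Hypothetical values: H(H) = 1, I(H;X) = 0.9, I(H;Y) = 0.8, I(H;Q) = 0.7.
    print(link_losses(1.0, 0.9, 0.8, 0.7))
    print([link_pe(1.0, mi) for mi in (0.9, 0.8, 0.7)])
```

The three link losses telescope to the cumulative loss ILQ = H(H) - I(H;Q), which is why the per-link values can be read as shares of one budget.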


3.7  Uncertainty  in  the  Information  Channel 

From  (2)  we  see  that  sources  of  uncertainty  introduced  in  the  channel  can  result  in  a  reduction 
in  I(H;Q)  and  subsequently  an  increase  in  Pe.  A  decrease  in  I(H;Q)  is  always  accompanied  by 
an  increase  in  H(Pe)  resulting  in  a  degradation  to  the  realized  Pe  . 


3.7.1  Categories  of  Uncertainty 

It is important to distinguish between the two distinct categories of uncertainty in Table I. The
source 1 uncertainty results from sensing while the source 2 uncertainty results from decision
rule training limitations.


3.7.1.1 Loss Due to Sensor Process

The  loss  at  X  ,  ILsa  ,  is  due  solely  to  the  sensing  process.  The  sensing  uncertainty  inherently 
alters  the  statistics  associated  with  X  generating  statistical  independence  between  X  and  X  thus 
degrading  the  performance  of  the  signature  sensing  process  as  quantified  by  Pe  in  (11a).  The 
loss in information due to the sensor uncertainty is then realized at X as ILSA in (11a) and is
quantified by the entropy H(PeX).


H(PeX) ≈ H(H) - I(H;X)  (12)


3.7.1.2 Decision Uncertainty Loss

The level of statistical agreement between X and X' will directly affect the loss in the channel
due solely to the decision process. The sensing uncertainty sources in Table I are to some degree
reproducible in the decision rule training process X'. However, sources 1(b) and 1(c) in Table I
are not fully reproducible in X'. The dissimilarity between X and X' results in a decision rule d
that is not optimal. The application of d to the feature process Y induces a loss in the channel
due to imperfect training.


The effects of decision uncertainty within the decision rule subspace are realized at Q as ILDA,
as illustrated in Fig. 2. The decision uncertainty ILDA can be interpreted in terms of the entropy
H(PeQ) as in (13) and quantified as defined in (11c).


H(PeQ) ≈ H(H,Q) - H(Q).  (13)


3.7.1.3 Training Uncertainty Loss

The feature extraction f and decision rule d in Fig. 2 are designed to maximize I(H;Q). The
resulting H(PeQ) provides the best possible performance for a given component design (radar
sensor design, feature selection, algorithm design, and decision rule design). As stated above, X
is often not completely observable, and a training surrogate X' is used to develop f and d. Under
conditions such as those listed in uncertainty source 2 in Table I, the surrogate representation X'
used in the training of the decision rule results in a non-optimal d. This is represented by the
altered entropic quantity H(Q') and, more importantly, I(H;Q'). The alternate Markov chain H →
X → Y' → Q' is shown as the dotted subspace H(Q') in Fig. 1. The corresponding form of (3) can
then be written as

H(Pe') = 1 - I(H;Q') + I(Q';V)


Therefore, since H(Pe') ≥ H(Pe),


I(H;Q') - I(Q';V) ≤ I(H;Q) - I(Q;V).

Corollary I: Information loss due to imperfect training, ILTA, is then mathematically
quantified in terms of the increase in entropy ΔH(Pe) resulting from a non-optimal design of f
and d.


ILTA = ΔH(Pe) = H(Pe') - H(Pe)
     = -I(H;Q') + I(Q';V) + I(H;Q) - I(Q;V)  (14)


If it can be shown that I(Q;V) = I(Q';V) and that


I(Q;V) ≪ H(H) - I(H;Q) and I(Q';V) ≪ H(H) - I(H;Q')


then


Imperfect Training Loss = ILTA = I(H;Q) - I(H;Q').  (15)


The decrease in information flow due to imperfect training is illustrated in Fig. 1 as the
reduction in overlap between the subspaces of H and Q.


Definition I:

The total loss in the channel is equal to the sum of all link loss components.

ILTotal = ILSA + ILFA + ILDA + ILTA  (16)


Definition II: Any phenomenon producing an increase in I(H;Q) and a subsequent
reduction in H(Pe) can be defined as a “system information gain” within the information
channel. Any phenomenon producing a decrease in I(H;Q) resulting in an increase in
H(Pe) is defined as a “system information loss”.

3.7.2  Propagating  Effects  of  Uncertainty 

Uncertainty propagation is the study of how uncertainty in the output of a model (numerical or
otherwise) can be apportioned to different sources of uncertainty in the model inputs [30]. Fig. 5
provides an illustration of a modeling and analysis approach to uncertainty propagation within
the sensitivity analysis and modeling of an information sensing system [30].

Fig. 5. Parametric Bootstrap Uncertainty and Sensitivity Analysis


The careful definition of variables plays a central role in case controlled studies of the effects
of uncertainty on system performance. The vector vc represents the control parameters of
interest within computer generated experiments. Absent the uncertainties identified in Table I,
the effects of selected values for vc on the deterministic mapping function PeX(vc) in (11a) are
certain. In further experimentation, the unknowable random environmental (VE) and
position (Vt) estimation effects in sensing are best studied statistically. Thus, the respective
estimated random input parameters of VE and Vt are introduced, resulting in the mapping to the
random signature process X(vc, VE, Vt). The sensing uncertainty is then subsequently propagated
into the random feature process Y(vc, VE, Vt) and ultimately to the decision process Q(vc, VE, Vt).
For brevity, Y(vc, VE, Vt) is written as Y and Q(vc, VE, Vt) is written as Q.


The distributions associated with the input parameters in VE and Vt are estimated from
experimental data. The estimated parameters become factors within a Monte Carlo simulation.
The cumulative link information losses, as quantified within (5) and defined in (9a), (9b), and (9c),
then become random variables as shown below.


ILX = H(H) - I(H; X(vc, VE, Vt))  (17a)

ILY = H(H) - I(H; Y(vc, VE, Vt))  (17b)

ILQ = H(H) - I(H; Q(vc, VE, Vt))  (17c)

Similarly, the link information losses ILSA, ILFA, and ILDA in (10a), (10b), and (10c) also become
random variables.


The unknowable characteristics of the observed signature process X are realized within the
input variables to the modeled training process X'(vc', VE', Vt'). If we assume that vc' ≠ vc,
VE' ≠ VE, Vt' ≠ Vt, then the mapping to the non-optimal decision rule will be d(vc', VE', Vt'),
which will be written as d for brevity. The decision rule d is applied to Y(vc, VE, Vt), generating
Q'(vc', VE', Vt'), written as Q', while the optimal decision rule dopt generates Q(vc, VE, Vt). Each
realization of d and dopt resulting from each ensemble X'(vc', VE', Vt') and X(vc, VE, Vt),
respectively, in the Monte Carlo simulation will result in the randomization of the imperfect
training loss function in equation (18)


ILTA = I(H;Q) - I(H;Q')  (18)


and the randomization of the cumulative loss function in (19)


ILQ' = H(H) - I(H;Q').  (19)


In (19), for the special case of VE' = VE and Vt' = Vt, the optimal training of d = dopt
yields ILTA = 0 and thus ILQ' = ILQ. To narrow the focus of analysis, the training space (vc',
VE', Vt') will be considered fixed and thus will become a component of the system control
parameter vc. Therefore d becomes fixed by design.


3.7.2.1 Independent Sources of Uncertainty Loss

Loss due to isolated sources of uncertainty within the channel can be computed to provide a
means to characterize the relative impacts to information flow at various points in the channel.
The various sources of sensing uncertainty induce information loss in the channel as
characterized by the random link loss functions ILSA, ILFA, ILDA, and ILTA. The prior
distributions on the random parameters within VE and Vt are propagated to the respective loss
functions using Monte Carlo simulation.


Definition III: The expected value of the link information loss can be written as the
expected values of the individual random loss components as in (20a)-(20d).


$\mu_{IL_{SA}} = E\{IL_{SA}\}$  (20a)

$\mu_{IL_{FA}} = E\{IL_{FA}\}$  (20b)

$\mu_{IL_{DA}} = E\{IL_{DA}\}$  (20c)

$\mu_{IL_{TA}} = E\{IL_{TA}\}$  (20d)

The mean total channel loss given in (21) follows from the linearity of the expectation operation
and the additive relationship between the formerly deterministic quantities in (20a)-(20d).

$\mu_{IL_{Total}} = \mu_{IL_{SA}} + \mu_{IL_{FA}} + \mu_{IL_{DA}} + \mu_{IL_{TA}}$  (21)


The sensing uncertainty factors within VE and Vt are assumed to be independent. Given that the
total loss function ILTotal can account for multiple independent sources of uncertainty within the
parameter space of (vc, VE, Vt), the variance on ILTotal is the sum of the individual variances
within the components of ILTotal.


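Corollary II: Assuming ne factors within VE and nt factors within Vt, the link loss
variance can be decomposed as given in (22a), (22b), (22c), and (22d).


$\sigma^2_{IL_{SA}} = \sigma^2_{IL_{SA}(V_{E_1})} + \dots + \sigma^2_{IL_{SA}(V_{n_e})} + \sigma^2_{IL_{SA}(V_{t_1})} + \dots + \sigma^2_{IL_{SA}(V_{n_t})}$  (22a)

$\sigma^2_{IL_{FA}} = \sigma^2_{IL_{FA}(V_{E_1})} + \dots + \sigma^2_{IL_{FA}(V_{n_e})} + \sigma^2_{IL_{FA}(V_{t_1})} + \dots + \sigma^2_{IL_{FA}(V_{n_t})}$  (22b)

$\sigma^2_{IL_{DA}} = \sigma^2_{IL_{DA}(V_{E_1})} + \dots + \sigma^2_{IL_{DA}(V_{n_e})} + \sigma^2_{IL_{DA}(V_{t_1})} + \dots + \sigma^2_{IL_{DA}(V_{n_t})}$  (22c)

$\sigma^2_{IL_{TA}} = \sigma^2_{IL_{TA}(V_{E_1})} + \dots + \sigma^2_{IL_{TA}(V_{n_e})} + \sigma^2_{IL_{TA}(V_{t_1})} + \dots + \sigma^2_{IL_{TA}(V_{n_t})}$  (22d)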
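The additive variance decomposition can be checked numerically on a toy model. The sketch below assumes a link loss that is the sum of two independent factor contributions, one environmental and one positional; the Gaussian distributions, their parameters, and the function name `simulate_loss_variances` are hypothetical and chosen only to illustrate the bookkeeping.

```python
import random
import statistics


def simulate_loss_variances(n_draws: int = 20000, seed: int = 0):
    """Monte Carlo check that independent factor variances add.

    Returns (var_env, var_pos, var_total) for a toy loss that is the sum
    of two independent contributions, in bits.
    """
    rng = random.Random(seed)
    # Hypothetical per-factor loss contributions, independent by construction.
    env = [rng.gauss(0.05, 0.010) for _ in range(n_draws)]
    pos = [rng.gauss(0.03, 0.020) for _ in range(n_draws)]
    total = [e + p for e, p in zip(env, pos)]
    return (
        statistics.pvariance(env),
        statistics.pvariance(pos),
        statistics.pvariance(total),
    )


if __name__ == "__main__":
    v_env, v_pos, v_tot = simulate_loss_variances()
    print(v_env + v_pos, v_tot)
```

With a finite number of draws the two printed values agree only up to the sample covariance between the factors, which shrinks as the number of draws grows.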


Definition IV: The expected value of the cumulative link information loss can then be
written as the expected values of the individual random cumulative loss components as in
(23a)-(23d).


$\mu_{IL_X} = E\{IL_X\}$  (23a)

$\mu_{IL_Y} = E\{IL_Y\}$  (23b)

$\mu_{IL_Q} = E\{IL_Q\}$  (23c)

$\mu_{IL_{Q'}} = E\{IL_{Q'}\}$  (23d)

Corollary III: Assuming ne factors within VE and nt factors within Vt, the cumulative link
loss variance can be decomposed as given in (24a), (24b), (24c), and (24d).


$\sigma^2_{IL_X} = \sigma^2_{IL_X(V_{E_1})} + \dots + \sigma^2_{IL_X(V_{n_e})} + \sigma^2_{IL_X(V_{t_1})} + \dots + \sigma^2_{IL_X(V_{n_t})}$  (24a)

$\sigma^2_{IL_Y} = \sigma^2_{IL_Y(V_{E_1})} + \dots + \sigma^2_{IL_Y(V_{n_e})} + \sigma^2_{IL_Y(V_{t_1})} + \dots + \sigma^2_{IL_Y(V_{n_t})}$  (24b)

$\sigma^2_{IL_Q} = \sigma^2_{IL_Q(V_{E_1})} + \dots + \sigma^2_{IL_Q(V_{n_e})} + \sigma^2_{IL_Q(V_{t_1})} + \dots + \sigma^2_{IL_Q(V_{n_t})}$  (24c)

$\sigma^2_{IL_{Q'}} = \sigma^2_{IL_{Q'}(V_{E_1})} + \dots + \sigma^2_{IL_{Q'}(V_{n_e})} + \sigma^2_{IL_{Q'}(V_{t_1})} + \dots + \sigma^2_{IL_{Q'}(V_{n_t})}$  (24d)

3.7.2.2 Propagating Link Loss to Link Performance

The variance and mean of the random cumulative loss components ILX, ILY, ILQ, and ILQ'
are used directly to determine the variance on the performance at the random link performance
components PeX, PeY, PeQ, and PeQ'. The Maximum Likelihood Estimate (MLE) of Pe is inferred at
each realization of the sufficient sample support about (vc, VE, Vt), providing the random mapping
to performance Pe at each link.


Corollary IV: Given sufficient sampling of the space of VE and Vt within the finite alphabets
|X| and |Y|, the environmental and position estimate uncertainty factors result in the
respective random performance at X and Y given by functions PeX(vc, VE, Vt) and PeY(vc, VE,
Vt) as in equations (25) and (26).


PeX = PeX(vc, VE, Vt) ≈ F(ILX)  (25)

PeY = PeY(vc, VE, Vt) ≈ F(ILY)  (26)

If the conditions of Corollary IV hold and perfect training conditions are assumed, where vc' = vc,
VE' = VE, Vt' = Vt, then the mapping to the decision rule dopt will be optimal.


Corollary V: The output of the discrete random variable Q (from the finite alphabet |Q|) is
driven by the inferred decision out of the application of each realization of Y to dopt. The
random performance function PeQ(vc, VE, Vt) can be expressed as a random realization of the
information loss in the channel, ILQ in (17c). Using the approximation form of (13)
(assume I(Q;V) ≈ 0), the random performance function PeQ is given by (27).

PeQ = PeQ(vc, VE, Vt) ≈ F{ILQ}  (27)


The approximation in (27) can be replaced by an equality using the full representation in
(4).


PeQ = F{ILQ + I(Q;V)}  (28)

In (27) and (28), the relaxation of the constraint VE' = VE and Vt' = Vt expands the study of the
effects of uncertainty to the loss due to the non-optimal training of d.


Corollary VI: The output of the discrete random variable Q' (from the finite alphabet |Q'|)
is driven by the inferred decision out of the application of each realization of Y to d. The
random performance function PeQ'(vc, VE, Vt) can be expressed as a random realization of
the information loss in the channel, H(H) - I(H;Q'). Fixing the suboptimal decision rule
d(vc' = pc, VE' = pE, Vt' = pt), using the form of (4), and assuming I(Q';V') ≈ 0, the random
performance function PeQ' is given by (29).

PeQ' = PeQ'(vc, VE, Vt) ≈ F{ILQ'} = F{H(H) - I(H;Q')}  (29)


The approximation in (29) is replaced by an equality using the full representation in (4).

PeQ' = PeQ'(vc, VE, Vt) = F{H(H) - I(H;Q') + I(Q';V')}  (30)


Definition V: The expected link performance under control parameters vc and in the
presence of sensing uncertainty (VE, Vt) is defined as the expectation of the random
link performance components PeX, PeY, PeQ, and PeQ'.

$\mu_{P_e^X} = E\{P_e^X\}$  (31a)

$\mu_{P_e^Y} = E\{P_e^Y\}$  (31b)

$\mu_{P_e^Q} = E\{P_e^Q\}$  (31c)

$\mu_{P_e^{Q'}} = E\{P_e^{Q'}\}$  (31d)

Given a sufficient number of Monte Carlo samples over the random parameters in VE and Vt, the
standard deviation of the random link component performance function is used as a measure of
reliability. Reliability is interpreted as the confidence that a classification event would result in
performance that would fall within the bounds of one standard deviation of the mean
performance.

Definition VI: Reliability in predicted link performance is defined as the standard
deviation ($\sigma_{P_e^X}$, $\sigma_{P_e^Y}$, $\sigma_{P_e^Q}$, and $\sigma_{P_e^{Q'}}$) of the respective random cumulative link performance
associated with PeX, PeY, PeQ, and PeQ'. The variability in link performance is defined as the
square of the reliability: $\sigma^2_{P_e^X}$, $\sigma^2_{P_e^Y}$, $\sigma^2_{P_e^Q}$, and $\sigma^2_{P_e^{Q'}}$.

3.8 Uncertainty in Performance

The independent sources of uncertainty contributing to $\sigma^2_{IL_X}$ in (24a) are individually
functionally mapped to the variance on the random performance function PeX to determine the
respective effects on the reliability of the predicted link performance estimate. The uncertainty
passes through the transcendental relationship between ILX and PeX. The nature of the
nonlinear relationship makes it difficult to compute the independent loss variance sources
analytically. It is important to relate the independent sources of uncertainty underlying $\sigma^2_{IL_X}$ to
the corresponding set of variances that combine to equal the variance on PeX.


While this relationship is transcendental and nonlinear, when the uncertainty is small and tight
about the mean, it is possible to approximate the inverse entropy function (F) by a linear
relationship [25] about the mean of ILX [4].

$F(IL_X) = a + b\,IL_X$


The mean and variance of the approximation are then

$\mu_{F(IL_X)} = a + b\,\mu_{IL_X}$

$\mathrm{Var}[F(IL_X)] = b^2\,\sigma^2_{IL_X}$

ILX 

Using established approximation techniques, the first order Taylor expansion of F around the
mean $\mu_{IL_X}$ of ILX is equal to

$F(IL_X) \approx F(\mu_{IL_X}) + F'(\mu_{IL_X})\,(IL_X - \mu_{IL_X})$.  (32)


Using the Taylor series expansion in (32), the approximations for E[F(ILX)] and Var[F(ILX)] are
[4]

$E[F(IL_X)] = E[P_e^X] \approx F(\mu_{IL_X}) = H^{-1}(\mu_{IL_X})$  (33)

$\mathrm{Var}[F(IL_X)] = \sigma^2_{P_e^X} \approx \{F'(\mu_{IL_X})\}^2\,\sigma^2_{IL_X}$  (34)

and $F'(\mu_{IL_X})$ can be shown to equal


$F'(\mu_{IL_X}) = \left[ \log_2\!\left( \frac{1}{H^{-1}(\mu_{IL_X})} - 1 \right) \right]^{-1}$
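The first order propagation in (33) and (34) can be sketched numerically. The derivative used below follows from differentiating the binary entropy function, dH/dp = log2((1 - p)/p), and inverting it. The names `inverse_entropy_slope` and `propagate` are illustrative assumptions, and the sketch covers only the binary case.

```python
import math


def binary_entropy(p: float) -> float:
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def inverse_entropy(h: float, tol: float = 1e-12) -> float:
    """F(h): Pe in [0, 1/2] with binary entropy h, found by bisection."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("entropy must lie in [0, 1] bits")
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binary_entropy(mid) < h:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inverse_entropy_slope(h: float) -> float:
    """F'(h) = 1 / log2((1 - p) / p) with p = F(h), for 0 < h < 1."""
    if not 0.0 < h < 1.0:
        raise ValueError("slope is only defined here for 0 < h < 1")
    p = inverse_entropy(h)
    return 1.0 / math.log2((1.0 - p) / p)


def propagate(mu_il: float, var_il: float):
    """First order mean and variance of Pe = F(IL), as in (33) and (34)."""
    return inverse_entropy(mu_il), inverse_entropy_slope(mu_il) ** 2 * var_il
```

As a usage example, a mean loss equal to the entropy of Pe = 0.05 gives a slope of about 0.24, so a loss variance of 1e-4 maps to a Pe variance of roughly 5.5e-6 under this approximation.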


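Assuming ne factors within VE and nt factors within Vt, the cumulative link loss variance
components given in (24a) are applied to (34).


$\sigma^2_{P_e^X} \approx \{F'(\mu_{IL_X})\}^2 \left[ \sigma^2_{IL_X(V_{E_1})} + \dots + \sigma^2_{IL_X(V_{n_e})} + \sigma^2_{IL_X(V_{t_1})} + \dots + \sigma^2_{IL_X(V_{n_t})} \right]$  (35)

The variance on the performance estimate PeX is then decomposed into the individual sources of
sensing uncertainty being propagated through the decision space at X.


$\sigma^2_{P_e^X} \approx \sigma^2_{P_e^X(V_{E_1})} + \dots + \sigma^2_{P_e^X(V_{n_e})} + \sigma^2_{P_e^X(V_{t_1})} + \dots + \sigma^2_{P_e^X(V_{n_t})}$  (36)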


Similar methods are applied to the independent contributions to the sensing uncertainty of VE
and Vt comprising the variances $\sigma^2_{P_e^Y}$, $\sigma^2_{P_e^Q}$, and $\sigma^2_{P_e^{Q'}}$ at Y, Q, and Q' respectively.

3.8.1  Stability  of  the  Linear  Approximation 

The validity of the linear approximation in (34) requires that $\sigma^2_{IL_X}$ be small. Thus, the contributing
sources of sensing uncertainty within $\sigma^2_{IL_X}$ must be individually small. Given that the regime of
interest is one where $\mu_{IL_X}$ and thus E[PeX] are small, the derivative (slope) evaluated at $\mu_{IL_X}$ is
relatively small. The slope within this regime is illustrated in Fig. 6 for an arbitrary operating
point.


[Plot of the inverse entropy function; horizontal axis: Entropy, Bits]

Fig. 6. The Inverse Entropy Function; f(z) = H^-1(w)


The slope, d/dw H^-1(w), is plotted in Fig. 7 for w ∈ [0, 1].


[Plot of the slope; horizontal axis: Entropy, Bits]

Fig. 7. The Derivative, d/dw H^-1(w), versus H(z)


ATR design solutions of interest are typically in the range where Pe < 0.1. From Fig. 7 it is
evident that the slope at an operating point within this regime will be in the range [0, 0.25],
affording reduced sensitivity to effects of the size of $\sigma^2_{IL_X}$.


3.9  Dimensionality  and  Computing 

The computation of the entropy of X involves the joint probability mass function (PMF) of the
random multivariate X and is complicated by the large dimensional nature of the observation
mapping H → X. It is desired to compute the discrete entropy for X absent any assumption
regarding dependence between the respective dimensions of X. If the X space consists of K
random variables or indices (dependent or independent) and the random variable x_k, k ∈ {1, ..., K}, has
n_k distinct bins (statistical divisions), then the size of the alphabet of X, |X|, is given in (37)
below. When every index uses the same binning, n_k = nb for all k.


$|X| = \prod_{k=1}^{K} n_k$  (37)


For example, if K = 3 and n_k = 2 for all k, |X| = 2·2·2 = 8.

The joint PMF of X, p(x_k^j); k ∈ {1, ..., K}, j ∈ {1, ..., nb}, is generated from a finite N sample ensemble and
discretely binned with nb statistical divisions within each of the K elements of X. Stable entropic
estimates require the statistics of the multivariate PMF be sampled sufficiently. A reasonable
example in the context of the HRR example with K = 10 and nb = 5 for all k would present a
theoretical upper bound on the typical set of 5^10 = 9,765,625 [2]. The typical set represents the
set of most probable events and contains almost all of the probability as the number of samples
increases. In the case of the radar example developed here, this would be the set of most
probable signature amplitude combinations for all K dimensions of X. To generate a meaningful
sample size for a PMF of this size, we would need to produce at least 10 times the actual typical
set. This means we need approximately 100 million samples. Thus K and nb drive the
dimensionality of X and subsequently the sampling requirements for each ensemble within the
Monte Carlo simulation.
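The arithmetic behind these sampling figures can be sketched in a few lines. The function names `alphabet_size` and `min_samples` are hypothetical; the factor of 10 is the oversampling rule of thumb quoted in the text.

```python
import math


def alphabet_size(bins_per_dim):
    """|X| as the product of the bin counts of each index, as in (37)."""
    return math.prod(bins_per_dim)


def min_samples(bins_per_dim, oversample: int = 10) -> int:
    """Sample count needed to cover the alphabet bound `oversample` times."""
    return oversample * alphabet_size(bins_per_dim)


if __name__ == "__main__":
    print(alphabet_size([2, 2, 2]))   # K = 3, two bins per index
    print(alphabet_size([5] * 10))    # HRR example: K = 10, nb = 5
    print(min_samples([5] * 10))      # ten times the typical set bound
```

The HRR case yields 9,765,625 cells and 97,656,250 samples, which is the "approximately 100 million" figure given above.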


A high dimensional problem is one where the alphabet of X, |X|, underlying the random
process far exceeds the number of samples observed (N), i.e., |X| ≫ N. Sensing systems
typically operate within this high dimensional signature data space of |X|. The high dimension
arises due to factors within the space X(vc, VE, Vt). Hypothesis testing and inference within the
high dimensional space of X in turn lead to large sampling requirements to adequately
determine the underlying statistical nature of the phenomenon under study. Without accurate
determination of the underlying system statistics, poorly performing hypothesis tests and/or
parameter estimation occur (Bias/Variance tradeoff) [34].

The number of statistical bins, nb, within the discrete sampling of the K element joint PMF of
X also has a significant effect on |X| and thus on the entropy computation of X. An increase in
the size of nb in X will result in an increase in the entropy of X. However, in the limit, the value for
I(H;X) as a function of nb asymptotes to a constant value after one reaches the full intrinsic
dimensionality of the subspace of I(H;X) [35]. This will be true for I(H;Y), I(H;Q), and
I(H;Q') as well. Choosing the most challenging link in the channel and without loss of
generality, a method for determining the intrinsic dimensionality of X is then needed to guide the
selection of N.


3.9.1  Sample  Size 

The subject of nonparametric estimation of discrete entropy and mutual information of random
variables has been widely published [36]. The intent of this section is not to expand this body
of knowledge but to record the approach employed to determine the minimum sampling
requirements for the entropy estimation. The variance parameters of these estimates are of
particular interest.


The link performance variability estimates at each of the respective links, $\sigma^2_{P_e^X}$, $\sigma^2_{P_e^Y}$, $\sigma^2_{P_e^Q}$, and $\sigma^2_{P_e^{Q'}}$,
are generated through a sufficient number of draws from the respective random link performance
functions PeX, PeY, PeQ, and PeQ'. Each draw involves the estimation of an entropic quantity
computed from the PMF p(x_k^j) based on the N sample ensemble taken from X. The estimate of the
link performance variability at X, $\hat{\sigma}^2_{P_e^X}$, is written more precisely as in (38) below.


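$\hat{\sigma}^2_{P_e^X} = \sigma^2_{P_e^X} + \sigma^2_{P_{e_N}^X}$  (38)


$\sigma^2_{P_{e_N}^X}$ is defined as the N sample estimation variance or “sampling uncertainty” associated with the
true variability $\sigma^2_{P_e^X}$. Equation (38) can be written as


$\hat{\sigma}^2_{P_e^X} = \sigma^2_{P_e^X} \left( 1 + \sigma^2_{P_{e_N}^X} / \sigma^2_{P_e^X} \right)$


For the high dimensional problem, N must be large enough for


$\sigma^2_{P_{e_N}^X} / \sigma^2_{P_e^X} \ll 1$  (39)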

The objective then is to produce link reliability estimates that are within this regime. N
must be selected to ensure the uncertainty of the entropic estimate is much less than the
reliability limits realized due to various factors within (vc, VE, Vt) under study. That is, the
ensemble size N of X, Y, Q, and Q' should be sufficiently large to ensure that the variance of
the estimate falls within three significant digits of the variability levels ($\sigma^2_{P_e^X}$, $\sigma^2_{P_e^Y}$, $\sigma^2_{P_e^Q}$, and
$\sigma^2_{P_e^{Q'}}$). Thus for the case of variability at X we desire $\sigma^2_{P_{e_N}^X} / \sigma^2_{P_e^X} \le 0.001$.

As stated above, |X| in particular can grow to large levels and as such the number of samples
required will grow as well. Given that the sampling ensemble size N of X is the defining case,
the following analysis is focused on the process at X and this minimum N determination will be
imposed also on Y and Q.


3.9.2  Phase  Transitions  and  the  Typical  Set 

The entropy computation [36] requires the development of the joint mass function associated
with the multivariate X, p(x_k^j); j ∈ {1, ..., nb}, k ∈ {1, ..., K}. The development of this mass function
assumes no independence between the K indices of X and is performed using a “linked list”
approach to limit the memory requirements during computation. A doubly linked list
implementation with a hash table search approach yields a computational complexity of O(n) [4].
The Miller-Madow estimate [37] provides faster convergence than the MLE method [2] for
finite sample estimates.


Maximum Likelihood Estimate of H(X_k):

$\hat{H}_{MLE_N}(X_k) = \sum_j -p(x_k^j)\,\log_2\{p(x_k^j)\}$  (40)


Miller-Madow Estimate of H(X_k) (note: M+ = number of statistical bins for which p(x_k^j) ≠ 0):


$\hat{H}_{MM_N}(X_k) = \hat{H}_{MLE_N}(X_k) + \{1/(2N)\}\{M_+ - 1\}$  (41)

The N sample estimates for $\hat{H}_{MLE_N}(X_k)$ and $\hat{H}_{MM_N}(X_k)$ are generated from the joint mass function
p(x_k^j); j ∈ {1, ..., nb}, k ∈ {1, ..., K}.
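The two estimators in (40) and (41) can be sketched with a sparse joint histogram. The sketch uses a Python `Counter` keyed by the K-tuple of bin indices as a stand-in for the linked list and hash table structure described above, so only occupied bins consume memory. The function names are hypothetical.

```python
import math
from collections import Counter


def joint_counts(samples) -> Counter:
    """Sparse joint histogram: one key per occupied K-dimensional bin."""
    return Counter(tuple(s) for s in samples)


def entropy_mle(counts: Counter) -> float:
    """Plug-in (maximum likelihood) entropy estimate in bits, as in (40)."""
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_miller_madow(counts: Counter) -> float:
    """Miller-Madow estimate using the correction as written in (41).

    Note: the correction (M+ - 1) / (2N) is applied exactly as printed.
    When the entropy is measured in bits, the usual statement of this
    correction carries an extra factor of 1 / ln(2).
    """
    n = sum(counts.values())
    m_plus = len(counts)  # number of bins with nonzero probability
    return entropy_mle(counts) + (m_plus - 1) / (2.0 * n)


if __name__ == "__main__":
    # Four samples, each in a distinct two-dimensional bin.
    counts = joint_counts([(0, 0), (0, 1), (1, 0), (1, 1)])
    print(entropy_mle(counts), entropy_miller_madow(counts))
```

Building the histogram is a single pass over the N samples, which is consistent with the linear complexity quoted for the hash-based approach.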


Phase transitions [38]-[42] within the growth trajectory of the estimated entropy with
increasing N are useful in defining the alphabet size |X|. The following illustration demonstrates
the usefulness of this approach. The signature process under evaluation is constructed by design
such that the actual entropy value is known. We model the multivariate random signature vector
X to be uniformly distributed (standard uniform {0,1}) with nb = 6 (all indices of X) and K = 3.
The theoretical maximum value of the entropy of X is then log2(nb^K) or log2(6^3) = 7.7549 Bits. In
Fig. 8 we incrementally generate the estimate of the discrete entropy of X for an increasing
number of samples. We plot the typical set of X for each increment. The typical set Aε = 2^H(X)
is computed from the discrete entropy H(X). Each of the estimated values for the typical set of
X asymptotes at the maximum dimensionality of X, where the theoretical values are H(X) =
7.7549 Bits and Aε = 216.




Fig.  8.  Phase  Transitions  in  X  and  Computing  the  Minimum  Sampling  NM,  MLE  Method 


Initially the samples are filling the open high dimensional space of X in a uniform fashion.
The linear dashed line represents the log2(N) growth of the entropy associated with this uniform
distribution. Note that the actual achieved entropy computation begins to diverge from that of a
uniform distribution. Only after the samples of X begin to accumulate in the bin space of the
joint mass function of X does this transition occur. This phase transition point represents the
point at which “collisions” occur and the fundamental statistics of X change.
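The growth of the estimated typical set with N can be reproduced with a short simulation of the nb = 6, K = 3 example. The sketch assumes that drawing each bin index uniformly is equivalent to binning a standard uniform variate into nb equal-width bins, and it uses the plug-in (MLE) entropy estimate. The function name `typical_set_size` and the fixed seed are illustrative choices.

```python
import math
import random
from collections import Counter


def typical_set_size(n_samples: int, n_bins: int = 6, k_dims: int = 3,
                     seed: int = 0) -> float:
    """Estimated typical set size 2**H for N uniform samples of X."""
    rng = random.Random(seed)
    # Each sample is a K-tuple of bin indices; only occupied bins are stored.
    counts = Counter(
        tuple(rng.randrange(n_bins) for _ in range(k_dims))
        for _ in range(n_samples)
    )
    h = -sum((c / n_samples) * math.log2(c / n_samples)
             for c in counts.values())
    return 2.0 ** h


if __name__ == "__main__":
    for n in (10, 50, 250, 1000, 25000):
        print(n, round(typical_set_size(n), 1))
```

For small N the estimate cannot exceed N, because at most N bins are occupied, which is the linear portion of the profile. Once collisions dominate, the estimate levels off just below the alphabet size of 216.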


The phase transition point is determined from the intersection of the line tangent to the linear
portion of the typical set profile and the line tangent to the asymptotic portion of the profile. The
number of samples coinciding with this phase transition point is NT. For the example here, NT is
found to be approximately 250 as illustrated in Fig. 8. The minimum number of samples, NM, is
taken to be 100 times the value of NT. In this example NM is found to be 25,000. The Miller-Madow
estimate for entropy $\hat{H}_{MM}(X_k)$ is used for all entropic computation within the remaining
body of this analysis.

3.10 Sampling Uncertainty for Probability of Error Estimate


Since the random estimation error variable is essentially the sum of many independently
distributed random variables, the estimation error is Gaussian. The standard deviation of the
Gaussian distribution of Î(H;X) will then scale as a function of 1/√N. Thus the variance on the
estimate Î(H;X), $\sigma^2_{\hat{I}(H;X)_{2N_T}}$, can be scaled to large sample size ($\sigma^2_{\hat{I}(H;X)_N}$). The standard deviation of
the estimate PeX can be determined from the independent contributions of H(H) and Î(H;X)
shown in (42).


PeX = H^-1(H(H) - Î(H;X))  (42)


For the equally probable binary hypothesis case, $H(H)$ is equal to 1 Bit. Therefore the sampling
uncertainty is a function only of the variance of the mutual information estimate,
$\sigma^2_{\hat{I}(H;X)_{N}}$.


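As in section III.H, the inverse entropy function in (42) is a transcendental function and as such
the variance on the estimate $\hat{P}_e^X$, $\sigma^2_{\hat{P}^X_{e_N}}$, can be very difficult to
determine analytically. Following a similar line of analysis as in section III.H using (33) and
(34), the mean and variance of $\hat{P}_e^X$ can be calculated as

$$E\big[\hat{P}_e^X\big] \approx H^{-1}\big(H(H) - \mu_{\hat{I}(H;X)_{2N_T}}\big) \qquad (43)$$

$$\sigma^2_{\hat{P}^X_{e_N}} \approx \left( \frac{d}{dI} H^{-1}\big(H(H) - I\big) \Big|_{I = \mu_{\hat{I}(H;X)_{2N_T}}} \right)^2 \sigma^2_{\hat{I}(H;X)_{N}} \qquad (44)$$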


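The use of (44) requires an estimate of the mean of $\hat{I}(H;X)$ which is taken to be the sample
mean $\mu_{\hat{I}(H;X)_{2N_T}}$. The ultimate goal is to learn the sampling uncertainty for
$\hat{P}_e^X$, $\sigma_{\hat{P}^X_{e_N}}$, from a low sample estimate of the mean of
$\hat{I}(H;X)$, $\mu_{\hat{I}(H;X)_{2N_T}}$. Manipulating (44) above,
$\sigma^2_{\hat{I}(H;X)_{N}}$ can be written in terms of the required variance on the estimate of
error, $\sigma^2_{\hat{P}^X_{e,Req}}$;

$$\sigma_{\hat{I}(H;X)_{N}} < \sigma_{\hat{P}^X_{e,Req}} \cdot \log_2\left( \frac{1 - H^{-1}\big(1 - \mu_{\hat{I}(H;X)_{2N_T}}\big)}{H^{-1}\big(1 - \mu_{\hat{I}(H;X)_{2N_T}}\big)} \right) \qquad (45)$$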


To ensure $\sigma_{\hat{P}^X_{e_N}} < \sigma_{\hat{P}^X_{e,Req}}$, the relationship in (45) is
essential.

The regime of interest is where $I(H;X)$ is close to 1 and $H^{-1}(1 - \mu_{\hat{I}(H;X)_{2N_T}})$,
and thus $\hat{P}_e^X$, is small. The derivative of the estimate in this regime is in the range of
[0, .25] as illustrated in Fig. 7. A slope of less than .25 is small relative to the range of
values given in Fig. 7 yet large with respect to $\mu_{\hat{I}(H;X)_{2N_T}}$. Therefore, errors in
the estimate of $\mu_{\hat{I}(H;X)}$ can have a significant impact on the estimate of the number
of samples required to reach a target sampling uncertainty of $\sigma^2_{\hat{P}^X_{e,Req}}$. This
means that a conservative approach is needed to estimate $E[\hat{I}(H;X)]$ based on a small number
of samples. Instead of using the sample mean $\mu_{\hat{I}(H;X)_{2N_T}}$ as an estimate of the
expectation $E\{\hat{I}(H;X)\}$, a value somewhat less than the sample mean should be chosen.
Depending on the level of confidence required in the estimate of the number of samples N, a
higher confidence estimate can be achieved by replacing $\mu_{\hat{I}(H;X)_{2N_T}}$ with
$\mu_{\hat{I}(H;X)_{2N_T}} - \sigma_{\hat{I}(H;X)_{2N_T}}$ in (45).


As discussed above, the variance on the estimate $\hat{I}(H;X)$,
$\sigma^2_{\hat{I}(H;X)_{2N_T}}$, can be scaled to large sample size
($\sigma^2_{\hat{I}(H;X)_{N}}$). The mean of the estimate of $\hat{I}(H;X)$,
$\mu_{\hat{I}(H;X)_{2N_T}}$, and the standard deviation, $\sigma_{\hat{I}(H;X)_{2N_T}}$, can be
estimated using the low number of samples ($N = 2N_T$).
3.11  Sampling  Uncertainty  versus  Variability  in  Performance 

The expression in (45) provides guidance on the level of sampling uncertainty associated with
$\hat{I}(H;X)$ required to achieve the corresponding sampling uncertainty in $\hat{P}_e^X$. A more
important question relevant to the study of uncertainty and performance estimation is the
relationship introduced in (44) and written in general form below.

$$\frac{\sigma^2_{\hat{P}^X_{e_N}}}{\sigma^2_{P^X_e}} < \alpha \qquad (46)$$


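The variable $\alpha$ can be set to limit the degree of sampling uncertainty to be realized in the
performance confidence analysis. Using (44), previous development, and the fact that
$\sigma^2_{\hat{I}(H;X)_{N}} = \frac{2N_T}{N}\,\sigma^2_{\hat{I}(H;X)_{2N_T}}$, (46) can be written
as in (47).

$$\sigma^2_{\hat{P}^X_{e_N}} = \sigma^2_{\hat{I}(H;X)_{2N_T}} \, \beta(N,N_T) \qquad (47)$$

The factor $\beta(N,N_T)$ in (47) is given as

$$\beta(N,N_T) = \frac{2N_T}{N} \left[ \log_2\left( \frac{1 - H^{-1}\big(1 - \mu_{\hat{I}(H;X)_{2N_T}}\big)}{H^{-1}\big(1 - \mu_{\hat{I}(H;X)_{2N_T}}\big)} \right) \right]^{-2}$$

Thus the expression in (48) can be used to test for conditions specified in (46).

$$\frac{\sigma^2_{\hat{I}(H;X)_{2N_T}}}{\sigma^2_{P^X_e}} \cdot \beta(N,N_T) < \alpha \qquad (48)$$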
The FBIT model provides a platform for the study and analysis of the relationship of the level
of sampling uncertainty to the level of performance uncertainty. Incremental values for the ratio
on the left side of (48) can be computed for increasing N. In section IV, the point at which the
inequality is obeyed is related to the phase transition minimum sample methods generated in
section III.I.
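The ratio test described above can be sketched as a small search for the ensemble size at which the sampling variance drops below a chosen fraction of the performance variance. This is a minimal sketch, assuming the sampling variance of the probability of error estimate scales as 1/N from a reference ensemble size; the argument names are hypothetical and do not come from the report.

```python
import math


def sampling_ratio(var_pe_ref, n_ref, n, var_pe_performance):
    """Ratio of the sampling variance of P_e (scaled from n_ref to n by 1/N)
    to the variance of P_e caused by the modeled uncertainty sources."""
    return (var_pe_ref * n_ref / n) / var_pe_performance


def minimum_ensemble_size(var_pe_ref, n_ref, var_pe_performance, alpha):
    """Smallest integer N for which sampling_ratio(...) is strictly below alpha."""
    n = math.floor(var_pe_ref * n_ref / (alpha * var_pe_performance)) + 1
    return max(n, 1)
```

Evaluating `sampling_ratio` over increasing N reproduces the kind of incremental test described for (48), and `minimum_ensemble_size` returns the crossing point directly.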
4  AN  INFORMATION  FLOW  NUMERICAL  EXAMPLE 

The application of the FBIT method to the study of uncertainty propagation is now illustrated
within a simple radar sensor example. An information loss budget is constructed for a baseline
design. Selected forms of uncertainty in Table I are introduced into the system to demonstrate
the analysis of the effects of propagating uncertainty through the information sensing channel.

4.2  Observed  Target  Scattering  Model 

In  the  high  frequency  regime  used  to  obtain  HRR  signatures,  the  target  may  be  approximated 
as  a  collection  of  scattering  centers  valid  over  a  limited  aspect  window  and  frequency  band. 
These  scattering  centers  may  be  considered  to  be  localized  to  a  point  and  may  represent  a  variety 
of  scattering  phenomena  ranging  from  specular  reflection  to  diffraction  phenomena  such  as  edge 
and  tip  diffraction.  The  fields  radiated  by  these  point  scatterers  depend  upon  both  temporal  and 
spatial  frequencies  (angular  dependence).  Since  the  radar  illuminating  the  target  has  finite 
bandwidth  and  is  a  one  dimensional  imaging  system,  the  target  is  seen  as  a  collection  of 
contiguous  swaths  of  range,  with  each  range  swath  corresponding  to  a  particular  range.  The 
extent  of  each  range  swath  (range  resolution)  depends  upon  the  signal  bandwidth.  For  a  typical 
extended  target  of  interest,  each  range  swath  contains  a  number  of  scattering  centers  which  can 
be  widely  spaced  in  cross-range  [17]. 


The electromagnetic field obtained as a result of the interference of the scattered fields from
the scattering centers appears as the signal corresponding to a particular range bin of the target
signature. The target signature may be considered to be a one dimensional image of the
reflectivity (or scattering) profile of the target for a given azimuth/elevation aspect angle
$(\theta,\phi)$ and bandwidth. The mathematical definition of the radar signature is developed
from the normalized scattered field in (49). $E^s$ and $E^i$ are the scattered field and the
incident field respectively.

$$S(\theta,\phi) = \lim_{R \to \infty} 4\pi R^2 \frac{|E^s|^2}{|E^i|^2} \qquad (49)$$


Using scattering center modeling and the far field approximation, (49) can be written in terms of
the target aspect angle and the transmitted wavelength as shown in (50) [43].

$$S_E(\theta,\phi,\lambda) = \sum_{m=1}^{M} A_m \, e^{-j 4\pi R_m / \lambda} \qquad (50)$$

In equation (50) $S_E$ is the band-limited frequency response of the target comprised of M
scattering centers at the respective range $R_m$. Conditioned on the target hypothesis H at a
fixed aspect angle $(\theta_t,\phi_t)$, $S_E(\theta_t,\phi_t) = S_E(\theta_t,\phi_t,\lambda)$;
$\lambda \in \{\lambda_i, \lambda_{i+1}, \ldots, \lambda_f\}$ defines the band-limited frequency
response of the normalized scattered field measurements given in (50). Clusters of simple
scattering centers are chosen for targets of interest at X-band frequencies (8-12 GHz) in the
following development. The targets are electrically large with dimensions in range and
cross-range of many wavelengths.

The target cluster of M isotropic scatterers occupies the target volume within the coordinate
system illustrated in Fig. 9.

Fig. 9. Radar Sensor Coordinate System
The 3 dimensional target scattering center configurations for the two targets examined in the
following example occupy an approximately cubic volume of {x=2, y=3, z=2.5} meters and are
positioned at a line-of-sight, $l_{os}$, of $(\theta_t,\phi_t) = (10^\circ, 7.5^\circ)$. Both
targets are comprised of 100 scattering centers of unity amplitude and three strong localized
scattering clusters of amplitude 5. Target 1 differs from target 2 in that target 1 is shorter
than target 2 in the Y dimension by .5 meters. One of the localized scattering clusters is also
displaced by (.2, .2, 0) meters.
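The scattering center description above lends itself to a short numerical sketch of a band-limited frequency response. This is a minimal sketch, assuming ideal point scatterers whose down-range positions along the line-of-sight are already known, a two-way phase of 4*pi*f*R/c, and evenly stepped frequencies; the function names and the sign convention of the phase are assumptions, not the report's definitions.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def frequency_response(ranges_m, amplitudes, freqs_hz):
    """Coherent sum of point-scatterer returns at each frequency:
    S(f) = sum_m A_m * exp(-j * 4*pi*f*R_m / c)."""
    return [
        sum(a * cmath.exp(-1j * 4.0 * math.pi * f * r / C)
            for r, a in zip(ranges_m, amplitudes))
        for f in freqs_hz
    ]


def stepped_frequencies(center_hz, bandwidth_hz, n_steps):
    """Evenly spaced frequencies spanning the transmit band (n_steps >= 2)."""
    start = center_hz - bandwidth_hz / 2.0
    step = bandwidth_hz / (n_steps - 1)
    return [start + k * step for k in range(n_steps)]
```

For example, `frequency_response([0.0, 0.2], [5.0, 5.0], stepped_frequencies(9.6e9, 800e6, 64))` gives the response of two amplitude-5 scatterers separated by .2 meters in range.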


4.3  Radar  Sensor  Model 

Applying matched filter processing and the discrete Fourier transform to the observed
signature $S_E(\theta_t,\phi_t)$ in additive noise, the measured HRR signature can be modeled for
a range of frequencies present in the transmitted waveform. The multidimensional encoded source
$X_E$ is defined here as the vector form of the time delay transformation of the band-limited
frequency response $S_E(\theta_t,\phi_t)$. The measured random signature process $X_n$ is then
defined as in equation (51), where n is additive white noise [17].

$$X_n = X_E + n \qquad (51)$$

The process $X_n$ is modeled at the output of a radar step frequency measurement sensor system
for the specified target aspect angle $(\theta_t,\phi_t)$. The additive noise process n is modeled
as the sum of thermal white noise and quantization noise components. The quantization error
component is treated as a random process uncorrelated with both the signal and the thermal noise.
The complete radar step frequency measurement model system parameters are summarized below in
Table III.


Table III. Sensor Summary

Center frequency                              9.6 GHz
Transmit Bandwidth                            800 MHz
Number Bits in A/D Conversion                 8 Bits
Number of Pulses Integrated                   1024
Signal-to-Noise Ratio (time delay domain)     20 dB (variable)
The sensing of $X_n$ in a dynamic real world environment is subject to the uncertainties listed in
area 1 of Table I leading to the random signature process X as outlined in section III.E and
summarized in Table II. Given the dynamic nature of the phenomenon underlying these
uncertainties, the statistics associated with the dimensions of X are often time varying. The
target statistics are assumed to be stationary (constant with time), thus the sample signatures
associated with this random vector correspond to a stationary random process. Given the short
measurement times associated with radar measurements of the nature under study, this
assumption is judged as appropriate.

4.3.1  Modeling  Pose  Angle  Estimation  Uncertainty 

The observed object aspect angle estimate can be viewed as lying within a solid cone angle
centered on the observed object aspect angle $(\theta_t,\phi_t)$. The parameter $\sigma_t$ is
defined as the uncertainty associated with the sensor estimate of $(\theta_t,\phi_t)$. The
parameters $\sigma_t$ and $\mu_t$ are elements of $V_t$ and are the standard deviation and bias of
the object aspect angle estimate respectively.

The variation in measured signature phenomenology due to the uncertainties in target aspect
angle is generated in the signal model in (50) through the introduction of distributions on
$\theta$ and $\phi$. The parameters $\theta$ and $\phi$ are both modeled as Gaussian random
variables each with variance $\sigma_t^2$ and mean $\mu_t + \theta_t$, $\mu_t + \phi_t$. The bias
parameter $\mu_t$ is assumed to be unknown and is modeled as uniformly distributed over the
interval [-1, 1] degrees.
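The pose angle model above can be sketched as a simple sampling routine. This is a minimal sketch under stated assumptions: a single bias value is drawn per sample and shared by both angles, all angles are in degrees, and the function name and the `seed` argument are illustrative additions for repeatability.

```python
import random


def sample_aspect_angles(theta_t, phi_t, sigma_t, bias_limit, n, seed=0):
    """Draw n (theta, phi) pairs in degrees.

    Each pair uses one unknown bias mu_t ~ Uniform[-bias_limit, bias_limit]
    and Gaussian scatter with standard deviation sigma_t about the biased mean.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        mu_t = rng.uniform(-bias_limit, bias_limit)
        theta = rng.gauss(theta_t + mu_t, sigma_t)
        phi = rng.gauss(phi_t + mu_t, sigma_t)
        draws.append((theta, phi))
    return draws
```

A call such as `sample_aspect_angles(10.0, 7.5, 0.75, 1.0, 1000)` corresponds to the aspect angle, standard deviation, and bias interval quoted in the text.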


4.3.2  Modeling  Leading  Edge  Position  Estimation  Uncertainty 

The target leading edge location estimation will vary under real world sensing conditions.
Thus the range alignment (along the $l_{os}$) of the measured signature process X to the decision
rule training process X' is imperfect and can be modeled as an uncertainty source. The alignment
of process X to X' is modeled through a positive bias applied to the phase center of the
scattering cluster underlying X. The bias parameter $\mu_r$ is assumed to be unknown and is
modeled as uniformly distributed over [0, .2] meters. Note that $\mu_r$ is another element of
$V_t$.


4.3.3  Modeling  Imperfect  Training 

The  training  process  component  X'  in  Fig.  2  represents  the  best  achievable  statistical 
characterization  of  the  observed  signature  process  X  .  Signature  training  processes  must 
represent  the  radar  measured  signature  process  across  a  wide  range  of  measurement  uncertainties 
and  target  configurations  as  well  as  under  many  uncertain  operating  conditions  including  clutter, 
obscuration,  and  other  sources  of  RF  interference.  Construction  of  a  signature  training  database 
derived  entirely  from  measurements  is  expensive  and  can  be  an  impractical  proposition.  It  is 
possible  to  construct  a  signature  database  using  electromagnetic  scattering  codes.  However, 
given  the  complexity  of  typical  targets  and  the  challenge  of  modeling  a  variety  of 
electromagnetic  scattering  phenomena  ranging  from  specular  reflection  to  edge  diffraction, 
smooth  surface  diffraction  etc.,  computation  of  signatures  with  sufficient  accuracy  is  also  a 
challenging task [17]. Within this analysis the dissimilarity between X and X' will be generated
using a matched scattering center model configuration with X. The uncertain parameters of $V_t$
and $V_E$ modeled within X are not modeled in X'. X' = X only when X is used directly for the
training of the decision rule d.


4.4  Feature  Discriminant  and  Decision  Rule  Design 

The function f used to compute the feature discriminant Y from X in Fig. 2 is developed from
the squared error of the distance from the mean templates $\mu_{X'_1}$ and $\mu_{X'_2}$ derived
from the marginal training processes $X'_1$ and $X'_2$ as defined below [33]. The operator $|z|$
is defined as the element-wise magnitude of each complex element of the random vector z.

$$\mu_{X'_1} = E\{|X'_1|\}, \qquad \mu_{X'_2} = E\{|X'_2|\}$$
The  Maximum  Likelihood  estimator  is  used  to  determine  the  optimal  decision  rule  d. 


d=  E 


Assuming equally likely priors on each of the binary hypotheses $H_1$ and $H_2$ in X and Y, the
samples (y) from Y are applied to the decision rule d. Samples with y < d are declared to be from
$H_1$ (denoted $Q_1$) and samples with y > d are declared to be from $H_2$ (denoted $Q_2$). The
in-class and out-of-class scoring system is given by the conditional probabilities $\alpha$,
$\beta$, $\gamma$, and $\kappa$ as provided below.

$$\alpha = P(Q_1 \mid H_1), \quad \beta = P(Q_2 \mid H_1), \quad \gamma = P(Q_1 \mid H_2), \quad \kappa = P(Q_2 \mid H_2)$$

The  output  of  the  decision  algorithm  Q  as  formed  from  the  scoring  system  above  can  be 
summarized  by  the  confusion  matrix  for  the  binary  classifier  given  in  Fig.  10  below. 


Test Class / Train Class     $X'_1$      $X'_2$
$X_1$                        $\alpha$    $\beta$
$X_2$                        $\gamma$    $\kappa$

Fig. 10. Confusion Matrix for Q
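The confusion matrix can be reduced to a probability of error and a mutual information value at the decision link. This is a minimal sketch, assuming the four entries are the conditional probabilities of each declaration given each true hypothesis (alpha and kappa in-class, beta and gamma out-of-class) and that the priors are known; the function name and this reading of the matrix layout are assumptions.

```python
import math


def confusion_metrics(alpha, beta, gamma, kappa, prior_h1=0.5):
    """Return (P_e, I(H;Q) in bits) for a binary confusion matrix.

    Assumed layout: alpha = P(Q1|H1), beta = P(Q2|H1),
                    gamma = P(Q1|H2), kappa = P(Q2|H2).
    """
    prior_h2 = 1.0 - prior_h1
    joint = [[prior_h1 * alpha, prior_h1 * beta],
             [prior_h2 * gamma, prior_h2 * kappa]]
    p_h = [prior_h1, prior_h2]
    p_q = [joint[0][0] + joint[1][0], joint[0][1] + joint[1][1]]
    mi = 0.0
    for i in range(2):
        for j in range(2):
            if joint[i][j] > 0.0:  # zero-probability cells contribute nothing
                mi += joint[i][j] * math.log2(joint[i][j] / (p_h[i] * p_q[j]))
    p_error = joint[0][1] + joint[1][0]
    return p_error, mi
```

A perfect classifier yields zero error and one Bit of mutual information, while a classifier that guesses yields an error of .5 and zero Bits.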
4.5  Certainty  States 

The “most certain” state achievable for the HRR radar example presented here is the case of the
observed deterministic multivariate signal in noise ($X_n$) when accompanied by perfect training
($X' = X_n$). Assuming sufficient sampling to completely determine the pdf associated with the
additive noise, the resulting statistical characteristics of the random performance functions
will resemble the delta function [44], and thus the reliability in predicted link performance
(such as $\alpha$) will be very high as shown in case 1 of Fig. 11. In a less certain case, the
signal under measurement is random in nature (X). The expected performance of the random
performance functions will reflect the loss in information due to the degree of uncertainty
present in X as well as a decrease in reliability. Given the progressively large number of
degrees of freedom associated with the uncertainty parameters associated with $V_E$ and $V_t$ in
X, the statistical support underlying the statistics of the random link performance functions
$P_e^X$, $P_e^Y$, $P_e^Q$, and $P_e^{Q'}$ can quickly increase as is shown in case 3-5 below in
Fig. 11.


Fig. 11. Propagation of Uncertainty Illustration (panels: Case 1, Signal in Noise; Case 3-5,
Random Signal in Noise)


Table IV below relates selected combinations of measurement and training uncertainty sources
from Table I. The cases 1-6 identified in Table IV represent the certainty states of interest
within the system. Case 1 of Table IV represents an observed process X of a stationary object of
known aspect angle with perfect training. Case 1 conditions correspond to the highest certainty
state possible. Case 2 corresponds to the observed process X of an object that is moving slowly
enough to appear stationary during the measurement interval. The aspect estimation uncertainty
is $\sigma_t$ = .75 degrees with an unknown bias ($\mu_t$) and again the training is perfect.
Case 3 conditions are similar with an unknown leading edge position bias $\mu_r$.

The parameter SNR is treated as an unknown parameter in Case 4. Case 5 is a combined
condition of the unknown parameters in Case 2, 3, and 4. In case 6, a form of imperfect training
is presented where the measurement parameter uncertainty provided in Case 5 is combined with
training level B ($\mu_r = 0$ and $\mu_t = 0$).


4.6  Sampling  and  FBIT  Analysis 

4.6.1  Signature  Ensembles 

The amplitude responses for the N sample ensemble of HRR signatures for a “baseline” set of
conditions defined as Case 2 ($\mu_r = 0$ and $\mu_t = 0$) are provided in Fig. 12.a and Fig. 12.b.

Fig. 12. HRR Signature Amplitude Ensemble (No Noise); N=$10^3$ (a. Magnitude of $X_1$; b.
Magnitude of $X_2$; horizontal axes: Range Bin Number)
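Table IV

Measurement and Training Certainty Cases

Case Number   Training Level                                 Measurement Level

Case 1        $X' = X_n$                                     $X_n$
              $\sigma_t = 0^\circ$, $\mu_t = 0^\circ$        $\sigma_t = 0^\circ$, $\mu_t = 0^\circ$
              $\sigma_r = 0$ m, $\mu_r = 0$ m                $\sigma_r = 0$ m, $\mu_r = 0$ m
              $(\theta_t,\phi_t) = 10^\circ, 7.5^\circ$      $(\theta_t,\phi_t) = 10^\circ, 7.5^\circ$
              SNR = 20 dB                                    SNR = 20 dB

Case 2        $X' = X$                                       $X$
              $\sigma_t = .75^\circ$, $\mu_t$*               $\sigma_t = .75^\circ$, $\mu_t$*
              $\sigma_r = 0$ m, $\mu_r = 0$ m                $\sigma_r = 0$ m, $\mu_r = 0$ m
              $(\theta_t,\phi_t) = 10^\circ, 7.5^\circ$      $(\theta_t,\phi_t) = 10^\circ, 7.5^\circ$
              SNR = 20 dB                                    SNR = 20 dB

Case 3        $X' = X$                                       $X$
              $\sigma_t = .75^\circ$, $\mu_t = 0^\circ$      $\sigma_t = .75^\circ$, $\mu_t = 0^\circ$
              $\sigma_r = 0$ m, $\mu_r$*                     $\sigma_r = 0$ m, $\mu_r$*
              $(\theta_t,\phi_t) = 10^\circ, 7.5^\circ$      $(\theta_t,\phi_t) = 10^\circ, 7.5^\circ$
              SNR = 20 dB                                    SNR = 20 dB

Case 4        $X' = X$                                       $X$
              $\sigma_t = .75^\circ$, $\mu_t = 0^\circ$      $\sigma_t = .75^\circ$, $\mu_t = 0^\circ$
              $\sigma_r = 0$ m, $\mu_r = 0$ m                $\sigma_r = 0$ m, $\mu_r = 0$ m
              $(\theta_t,\phi_t) = 10^\circ, 7.5^\circ$      $(\theta_t,\phi_t) = 10^\circ, 7.5^\circ$
              SNR*                                           SNR*

Case 5        $X' = X$                                       $X$
              $\sigma_t = .75^\circ$, $\mu_t$*               $\sigma_t = .75^\circ$, $\mu_t$*
              $\sigma_r = 0$ m, $\mu_r$*                     $\sigma_r = 0$ m, $\mu_r$*
              $(\theta_t,\phi_t) = 10^\circ, 7.5^\circ$      $(\theta_t,\phi_t) = 10^\circ, 7.5^\circ$
              SNR*                                           SNR*

Case 6        $X' \neq X$                                    $X$
              $\sigma_t = .75^\circ$, $\mu_t = 0^\circ$      $\sigma_t = .75^\circ$, $\mu_t$*
              $\sigma_r = 0$ m, $\mu_r = 0$ m                $\sigma_r = 0$ m, $\mu_r$*
              $(\theta_t,\phi_t) = 10^\circ, 7.5^\circ$      $(\theta_t,\phi_t) = 10^\circ, 7.5^\circ$
              SNR = 20 dB                                    SNR*

*note: Parameter $\mu_t$ modeled uniform [-1°, 1°], Parameter $\mu_r$ modeled uniform
[0, .2] m, Parameter SNR modeled Gaussian ($\mu$ = 20 dB, $\sigma^2$ = 4 dB)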
The five target features (K=5) at range bins 17-21 are selected for discriminant processing in
X -> Y.

4.6.2  Sampling  Uncertainty  Example 

The sampling uncertainty defined in Section III.J is illustrated using the baseline uncertainty
conditions and multiple target ensembles similar to those given above in Section III.I.2. Using
the Monte-Carlo simulation, the typical sets for $X_1$, $X_2$, and X are computed for an
increasing value of N. Multiple ensembles of each are simulated at each value of N to generate
both the mean and variance of the entropy estimate within the typical set.

The typical set plot in Fig. 13 provides the value for $N_M$ for the entropy estimates for X as
defined in Section III.I.2. The number of samples required for each ensemble based on the phase
transition at $N_T = 2 \times 10^3$ within the typical set profile is determined to be
$N_M = 2 \times 10^5$.

Fig. 13. Phase Transition Within Typical Set of X Versus N; $n_b$=6
Figure 14 demonstrates the entropy scaling property discussed in section III.J. In the following
example, Monte-Carlo simulation is used to compute the actual estimation variance (L
draws=1000) at each incremental setting of $N_M$. The estimation variance at
$N_T = 3 \times 10^3$ is scaled to each setting of $N_M$ to a maximum value of
$N_M = 2 \times 10^5$, validating the use of the 1/N scaling factor in (47).

Fig. 14. Scaled Standard Deviation of Estimator of Entropy of X Versus N; $n_b$=6, L=1000

The sampling uncertainty associated with entropic estimation at X is realized within the
estimate $\hat{I}(H;X)$. Figure 15 applies the 1/N scaling directly to the MLE estimates of
$\hat{I}(H;X)$, $\hat{I}(H;Y)$, and $\hat{I}(H;Q)$ beginning at $2 \times N_T = 6 \times 10^3$.

Fig. 15. Scaling Properties of $\hat{I}(H;X)$, $\hat{I}(H;Y)$, $\hat{I}(H;Q)$ Versus Ensemble
Size N, $N_T = 3 \times 10^3$, $n_b$=6, L=1000


In Equation (47), Corollary IV and V are used to compute the sampling uncertainty associated
with the estimate of the probability of error. The following figures demonstrate the accuracy of
using (44). Equation (44) is applied at each link in the radar channel. Note that each application
of (44) is conducted with $2 \times N_T = 6 \times 10^3$ as the basis for the scaling. The
approximation for the standard deviation of the probability of error is computed for the complete
range of ensemble size out to $N = 3 \times 10^4$. Figure 16 provides a comparison of the
probability of error estimate using (44) to the error computed using simulation. These results
show that the estimates agree closely with the “actual” results. This agreement indicates that
the dispersion of the mutual information estimate about its mean is low enough to support the
use of the linear approximation.

Fig. 16. $\sigma_{\hat{P}^X_{e_N}}$, $\sigma_{\hat{P}^Y_{e_N}}$, and $\sigma_{\hat{P}^Q_{e_N}}$
Versus the Linear Approximation, $N_T = 3 \times 10^3$, $n_b$=6, L=1000

The application of (44) at each draw of the Monte-Carlo simulation will generate an estimate of
the sampling uncertainty associated with the probability of error estimate. Figure 17 illustrates
the application of (44) to the results in Fig. 15.

Fig. 17. Scaling Properties of $\hat{P}_e^X$, $\hat{P}_e^Y$, $\hat{P}_e^Q$ Versus N (horizontal
axis: log of Number of Samples), $N_T = 3 \times 10^3$, L=1000
Equation (48) provides the test for minimum sampling based on low sample ensemble sizes.
In Fig. 18, (48) is applied to the radar example at the three link positions X, Y, and Q. The
solid lines in Fig. 18 show the true ratio of sampling variance to the variability in predicted
link performance as a function of ensemble size N. The dashed lines represent the ratio as given
by the 1/N scaling as discussed above. The required ratio $\alpha$ is given by the dashed black
line at two different levels. The results of the test given in (48) are given at each increment
for $N_T = 3 \times 10^3$. The interesting observation in Fig. 18 is that the point at which the
test falls below the threshold $\alpha$ is consistent with the ensemble size $N_M$ as derived
from the phase transition point $N_T$ as outlined in section III.K. This is a significant
validation of the use of the phase transition method for estimating minimum ensemble size
within Monte-Carlo simulation. The results of the three tests above provide insight into the
relationship of the required ensemble size N to the reliability in link performance estimates
within sensitivity analysis simulations.

Fig. 18. Sampling Uncertainty and Testing $N_M$ at X, Y, Q
4.6.3  The  Fano  Equality 

It is important to demonstrate the validity of Theorem I as written in (2). Using the radar
example, Fig. 19 illustrates that the addition of I(Q;V) brings the approximation form of Fano
into agreement with the “true” probability of error as simulated using Monte-Carlo within the
radar example outlined above. Again using Case 2 ($\mu_r = 0$ and $\mu_t = 0$) conditions for the
binary classification, the performance given by the Fano approximation is given by the red line.
The simulated “true” (actual) performance is given by the black line. The green line represents
the performance using the equality (exact) form of Fano in Theorem I. The equality form of Fano
agrees with the “true” performance, which validates Theorem I.

Fig. 19. Actual Mean Probability of Error at Q Versus Mean Fano Estimated Performance at Q,
$n_b$=6, L=1000
5  EXPERIMENTS 

Several hypotheses directly highlight the potential advantages of the FBIT method in the
performance characterization of an information sensing system in the presence of various
uncertainties.

5.2  Research  Hypotheses 

Hypothesis 1: Information flow through the components of a sensing system can be studied and
information bottlenecks can be identified. System performance upper bounds can be characterized
based on the loss in information attributed to each component.

Hypothesis 2: For a fixed H(H), maximizing I(H;Q) will minimize the equivocation H(H|Q) and
thus minimize the probability of error $P_e$. Thus, system component design parameters ($V_c$)
can be traded directly with loss in the channel to minimize $P_e$.

Hypothesis 3: Sources of uncertainty can be characterized in terms of their effects within the
decision rule subspaces and their relative impact on losses within several subsystem components
of the radar system.

Hypothesis 4: The increasing dimensionality of X will eventually lead to an unacceptable
degradation of the reliability in predicted link performance.

Hypothesis 5: Selected sources of uncertainty within the radar information channel can be ranked
as to their relative impact on the performance of the information exploitation system.


5.3  Experiments 

The  experiments  conducted  to  address  the  hypotheses  above  are  given  in  Table  V. 
Table V. List of Experiments and Applicable Cases

Experiment                                      Case           Hypothesis
1. Information Flow                             2              1
2. System Trades                                2              2
3. System Uncertainty and Information Flow      1,2,3,4,5,6    3,4,5

5.3.1  Information  Flow  and  Design  Trades  within  the  Radar  Channel 

The value of the Data Processing Inequality is readily seen from Fig. 20-22 below where the
individual loss at each link in the channel can be quantified. In each of the figures, the MI and
probability of error are computed for a changing design parameter within $V_c$. Three design
parameters are traded: system thermal noise, system dynamic range, and system bandwidth.

The signal-to-noise ratio of the signatures resulting from sensor measurements depends in part
on the noise figure of the system. In Fig. 20 thermal noise is scaled by varying the noise figure
across a range that produces an SNR range of 1 dB to 10 dB (SNR is given in the frequency domain
prior to inverse Fourier Transform gain). The results of the SNR trade indicate that an SNR of 8
dB in the frequency domain (19 dB in the time-delay domain after transform gain) will generate
maximum information flow.

Fig. 20. System Thermal Noise Trade, SNR in Frequency Domain


It is also of interest how the dynamic range of the sensor affects the information flow through
the channel. Specifically, the sensitivity of I(H;Q) and ultimately $P_e$ to the dynamic range in
the sensor is of interest. The A/D conversion of the radar intermediate frequency (IF) signal to a
digital representation must preserve the amplitude and phase information contained in the radar
return with minimum error. The effects of quantization at each measurement point (quantization
event) due to the two's complement rounding error are assumed to be zero mean white noise
processes [45]. The A/D conversion and associated quantization noise are modeled as an
additive noise component e and added to the measured signature process [46].

$$X_n = X_E + n + e \qquad (52)$$

The maximum dynamic range supportable by a “B-bit” quantizer is the ratio of the largest
representable magnitude to the smallest nonzero representable magnitude [47]. The dynamic
range for two's complement and magnitude encoding for a “B-bit” quantizer is [48]

$$\text{Dynamic Range (dB)} = 20 \log_{10}\big(2^{B-1}\big)$$

The trade in Fig. 21 indicates that a 3 or 4 Bit A/D converter is needed to maximize information
flow in the channel given the binary target set under evaluation.
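The dynamic range relationship for a B-bit quantizer can be evaluated directly. This is a minimal sketch, assuming the ratio of largest to smallest nonzero representable magnitude is 2^(B-1), which is the two's complement case; the function name is illustrative.

```python
import math


def dynamic_range_db(bits):
    """Dynamic range in dB of a B-bit quantizer, taken as 20*log10(2**(B-1))."""
    return 20.0 * math.log10(2 ** (bits - 1))
```

Under this assumption each additional bit adds about 6 dB, so the 8 Bit converter of Table III supports roughly 42 dB while the 3 and 4 Bit converters discussed above support roughly 12 dB and 18 dB.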
Fig.  21.  System  Dynamic  Range  Trade 


The  analysis  of  the  bandwidth  trade  in  Fig.  22  can  be  nicely  linked  to  the  physical  scattering 
configurations  of  target  1  and  target  2.  As  was  mentioned  earlier  in  the  report,  the  locations  for 
the  non-collocated  dominant  scatterer  differ  by  .2  meters  or  .65  feet. 

One would then expect a ‘bump’ in information flow when the bandwidth reaches levels that
support the resolution necessary to resolve the peaks associated with these two scatterers.
The theoretical bandwidth needed to achieve this feature separation is approximately 800 MHz
using the fundamental relationship between range resolution and bandwidth,
$\Delta R = c / (2\,BW)$.

In Fig. 22 the bump in performance is centered at 800 MHz, where the mutual information at Y
and Q is rapidly increasing and where the probability of error is greatly reduced.

Fig. 22.  System Bandwidth Trade
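
As a quick check on the bandwidth figure quoted above, the sketch below evaluates the standard range-resolution relation delta_R = c / (2 BW) for the 0.2 meter scatterer separation. The use of this particular relation and the function name are assumptions; the report's own equation was not recoverable from the source text.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum

def bandwidth_for_resolution(delta_r_m):
    """Bandwidth in Hz whose range resolution c / (2 BW) equals delta_r_m."""
    return C_M_PER_S / (2.0 * delta_r_m)

bw_hz = bandwidth_for_resolution(0.2)
print(f"Required bandwidth: {bw_hz / 1e6:.0f} MHz")
```

The result is roughly 750 MHz, which is consistent with the approximately 800 MHz quoted in the text.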


In  each  figure  it  can  be  seen  that  the  MI  decreases  as  links  move  further  down  the  channel. 

With  one  Bit  going  into  the  channel  (binary  classification  problem),  Table  VI  below  tabulates  the 
information  loss  budget  for  each  trade  study  at  the  selected  baseline  operating  point. 


The study of Table VI reveals several key points. First, in this particular example problem, the
targets appear to be separating very well at X, and much of the loss occurs within the feature
extraction and at the application of the decision rule. The loss at link Y appears to be the
dominant information-limiting component in the system: there is a loss of 0.3 to 0.4 Bits at the
feature extraction function at Y. The information loss associated with signature measurement
and signature processing results in only 0.1 Bits of loss. This is very important information in
achieving optimization of the system design for information sensing. Little gain in the overall
performance of the system can be expected through the expansion of sensing degrees of
freedom (DOF).

Table VI

Information Loss Budget for Various Trades

                                   Information Loss, Bits
System Component                   Trade 1    Trade 2    Trade 3
                                   SNR        DR         BW
Source-to-Measurement (X)          0.1        0.1        0.05
Measurement-to-Feature (Y)         0.4        0.3        0.4
Decision Rule Application (Q)      0.1        0.2        0.1
Total Channel Loss*                0.6        0.6        0.55

*Baseline Conditions: SNR=20 dB, BW=800 MHz, DR=20 dB
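
The bookkeeping behind the loss budget can be sketched as follows. The per-link values are those listed in Table VI; the dictionary layout, the labels, and the interpretation of "remaining information" as one bit minus the total channel loss are illustrative assumptions.

```python
# Per-link information loss (Bits) read from Table VI; keys are illustrative labels.
LOSS_BUDGET = {
    "SNR": {"X": 0.1, "Y": 0.4, "Q": 0.1},
    "DR": {"X": 0.1, "Y": 0.3, "Q": 0.2},
    "BW": {"X": 0.05, "Y": 0.4, "Q": 0.1},
}
SOURCE_ENTROPY_BITS = 1.0  # one Bit enters the channel (binary target set)

def summarize(budget):
    """Return (total loss, remaining information, dominant link) for each trade."""
    summary = {}
    for trade, losses in budget.items():
        total = round(sum(losses.values()), 2)
        remaining = round(SOURCE_ENTROPY_BITS - total, 2)
        dominant = max(losses, key=losses.get)   # link with the largest loss
        summary[trade] = (total, remaining, dominant)
    return summary

for trade, (total, remaining, dominant) in summarize(LOSS_BUDGET).items():
    print(f"{trade}: total loss {total}, remaining {remaining}, dominant link {dominant}")
```

For all three trades the largest single contribution comes from link Y, in line with the discussion of the feature extraction stage.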


Also, the loss due to the decision component of the system is in the range of 0.1 to 0.2 Bits.
Depending on the performance requirements of the system, improvements to the decision stage
of the system may or may not be warranted. Prior to the decision stage of the system, 0.4 to 0.5
Bits of cumulative loss have been sustained, resulting in an “upper bound” in performance of
approximately Pe = 0.1. No improvements to the classifier design within the decision
component of the system can improve upon this performance level. Improvements appear to be
best directed toward the feature extraction stage.
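
One way to see how a cumulative loss of 0.4 to 0.5 Bits maps to a probability of error near 0.1 is to invert the binary form of Fano's inequality numerically. The sketch below assumes equally likely binary targets (one Bit of source entropy), so that the binary entropy of Pe must be at least the cumulative loss; the function names and the bisection approach are illustrative and may differ in detail from the report's equations.

```python
import math

def binary_entropy(p):
    """Binary entropy in bits; defined as 0 at p = 0 and p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def fano_min_error(loss_bits, tol=1e-10):
    """Smallest Pe in [0, 0.5] whose binary entropy is at least loss_bits."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:                 # bisection: entropy is increasing on [0, 0.5]
        mid = 0.5 * (lo + hi)
        if binary_entropy(mid) < loss_bits:
            lo = mid
        else:
            hi = mid
    return hi

for loss in (0.4, 0.5):
    print(f"cumulative loss {loss} Bits -> Pe >= {fano_min_error(loss):.3f}")
```

Under these assumptions the bound falls between roughly 0.08 and 0.11, which is consistent with the Pe of about 0.1 quoted above.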


An optimal design operating point may, for example, include the following component
selections: (i) an A/D converter with B=4 Bits, (ii) a receiver design which achieves 20 dB SNR
under tactically significant conditions, and (iii) a transmit waveform with BW > 800 MHz.

5.3.2  Information  Flow  and  System  Uncertainty 

The study of the effects of sources of uncertainty on system performance confidence while
under control parameters V_c and in the presence of sensing uncertainty (V_E, V_t) is of particular
interest. For a fully sampled signature process with negligible sampling uncertainty per (46), the
FBIT  method  can  be  applied  to  study  the  independent  sources  of  uncertainty.  The  effects  of  each 
independent source of uncertainty can be studied at each link in the channel. Equation (36) is
demonstrated for links X, Y, and Q under case 5 conditions defined in Table IV. Under these
conditions, three independent sources of uncertainty are introduced in the system under perfect
training conditions. An unknown bias in target aspect estimation and an unknown bias in
leading-edge range estimation are assumed. The target range is also unknown, and as such a
third uncertainty is introduced in the SNR of the measured signature. All assumed statistics
associated with the uncertainties are as defined under case 5 of Table IV and as described in
section IV.D.


Using Monte-Carlo simulation, L independent draws of an Nm-sample ensemble from X are
generated. The FBIT method is applied at each draw to generate the decomposition of the
performance estimate reliability in (36) at X, Y, and Q. In Fig. 23 the cumulative link loss
standard deviation defined in (24.a), (24.b), and (24.c), resulting from the sum of the three
independent uncertainty sources, is computed about the expected link information loss defined
in (23.a), (23.b), and (23.c). To clearly illustrate the level of agreement between the independent
link loss contributions and the total produced by the joint simulation, the individual
contributions to the cumulative link loss variance are plotted incrementally in Fig. 23. Fig. 23
shows that the sum of the independent uncertainty sources yields the same results as the
Monte-Carlo simulation involving all three factors in a joint process.

Fig. 23.  Cumulative Channel Link Loss Variance, L=100, N=3×10^5
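
The additivity check described above (the sum of per-source variances compared against the variance from a joint simulation) can be illustrated with a toy Monte-Carlo sketch. The linear link-loss model, its coefficients, the unit-variance Gaussian sources, and all names below are hypothetical stand-ins; the sketch does not implement the FBIT computation or the case 5 statistics of Table IV.

```python
import random
import statistics

SOURCES = ("aspect", "range", "snr")

def toy_link_loss(aspect_bias, range_bias, snr_offset):
    """Hypothetical smooth link-loss model in Bits (not the report's model)."""
    return 0.35 + 0.02 * aspect_bias + 0.01 * range_bias + 0.03 * snr_offset

def loss_variance(active, n_draws=20_000, seed=1):
    """Variance of the toy loss when only the sources named in `active` vary."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_draws):
        # Draw every source each time so the random streams stay aligned across runs.
        draws = {name: rng.gauss(0.0, 1.0) for name in SOURCES}
        args = [draws[name] if name in active else 0.0 for name in SOURCES]
        samples.append(toy_link_loss(*args))
    return statistics.pvariance(samples)

# One-at-a-time runs, summed, versus a single run with all sources active.
separate = sum(loss_variance({name}) for name in SOURCES)
joint = loss_variance(set(SOURCES))
print(f"sum of separate variances = {separate:.6f}, joint variance = {joint:.6f}")
```

For independent sources and a model that is close to linear over the range of the perturbations, the two numbers agree up to sampling error; strong nonlinearity or correlated sources would break this agreement.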

The corresponding impacts to the reliability in link performance can be generated through the
application of Corollaries IV and V. The reliability in predicted link performance as quantified
by Definition VI, resulting from the sum of the three independent uncertainty sources, is
presented in the error bars about the expected link performance defined in (31.a), (31.b), and
(31.c). The dashed line represents the results of the joint Monte-Carlo simulation where all three
independent uncertainty factors are simulated simultaneously. The results in Fig. 24 show
clearly that the sum of the independent events equals the joint event, thus validating the
assumption of independence in the three sources of uncertainty acting on the predicted
performance risk.


In Fig. 25 a similar validation of the propagation of independent uncertainty sources is given
for the reliability in predicted performance. The example demonstrates that the use of
Corollaries IV and V to approximate the reliability of the performance estimate using the link
loss variance is a very effective means to address the transcendental relationship underlying this
method. The data points marked with asterisks represent the sum of the independent
contributions to the reliability in performance prediction. The respective plotted lines represent
the results of direct simulation at the specified link.

Fig. 25.  Reliability in Predicted Link Performance Given as Variance, L=50, N=3×10^5


The  implications  of  imperfect  training  are  realized  in  the  final  stage  of  the  channel  at  Q’  as 
shown  in  Fig.  24.  At  Q’,  Case  6  conditions  in  Table  IV  are  used  to  present  a  naive  training 
approach  as  developed  in  section  IV.D. 


A summary of the expected link loss, expected link performance, reliability in link
performance, and results of respective sampling uncertainty tests in Fig. 18 are given in Table
VII below. The reliability in predicted performance decreases as information propagates down
the sensing channel. The expected link performance also decreases in accordance with the
principles of mutual information and the Data Processing Inequality. Much of the decrease in
reliability and loss in predicted performance comes at the feature extraction stage in the
system. The reduced reliability in performance prediction is most sensitive to the uncertainty
factor of SNR. The effects of the factors associated with target range bias and pose estimate
bias are of less significance relative to the total reliability in predicted performance.


From  Table  VII  it  can  be  seen  that  gains  in  performance  due  to  component  design  trades  must 
also  take  into  account  the  reliability  level  associated  with  predicted  performance. 

Table VII

Information Confidence & Loss Budget for Various

                         Link Information Measure
Link    Link Loss,    Expected Link    Reliability in       Sampling
        Bits          Performance      Link Performance     Uncertainty Test
H       0.0           -                -                    -
X       0.05          Pe = 0.013       σ = 0.003            < 0.001
Y       0.35          Pe = 0.075       σ = 0.0228           < 0.003
Q       0.16          Pe = 0.12        σ = 0.0255           < 0.006
Q’      0.04          Pe = 0.125       σ = 0.0266           -


In this example problem, changes within two significant digits of the expected performance
should be studied in the context of the reliability of the performance estimates based on
uncertainty factors introduced in the system.
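
A simple way to apply this guidance is to compare a change in expected performance against the combined reliability of the two estimates. The sketch below assumes the reliability values in Table VII are standard deviations, that the two estimates are independent, and that a two-standard-deviation threshold is appropriate; all three are illustrative assumptions, as is the function name.

```python
import math

def change_exceeds_uncertainty(pe_a, sigma_a, pe_b, sigma_b, k=2.0):
    """True if |pe_a - pe_b| exceeds k combined standard deviations."""
    combined = math.sqrt(sigma_a ** 2 + sigma_b ** 2)   # independence assumed
    return abs(pe_a - pe_b) > k * combined

# Values read from Table VII for links Q and Q' (interpreted as standard deviations).
print(change_exceeds_uncertainty(0.12, 0.0255, 0.125, 0.0266))
```

With these inputs the 0.005 difference between Q and Q’ is well inside the combined uncertainty, so the check reports that the change is not distinguishable at the chosen threshold.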


6.  CONCLUSION


The FBIT method is developed for use in the research of information sensing applications.
Measures are developed to identify information flow bottlenecks and to form an information link
budget for system analysis. Techniques for bounding asymptotic performance under sufficient
sampling are characterized. Test criteria are developed for controlling sampling uncertainty
within the uncertainty propagation analysis approach. Test criteria are linked to phase transitions
within the typical set trajectory associated with the entropy estimation of high dimensional
signature processes. The FBIT method and test criteria are applied to an HRR radar numerical
example. The propagating effects of various sensing uncertainties on system performance are
characterized at the component level.